ABSTRACT

This study is the first in a series in which we analyze the structure and topology of
the Cosmic Web as traced by the Sloan Digital Sky Survey. The main issue addressed
in the present study is the translation of the irregularly distributed discrete spatial
data in the galaxy redshift survey into a representative density field. The density field
will form the basis for a statistical, topological and cosmographic study of the cosmic
density field in our Local Universe.

We investigate the ability of three reconstruction techniques to analyze and
investigate weblike features and geometries in a discrete distribution of objects. The three
methods are the linear Delaunay Tessellation Field Estimator (DTFE), its higher order
equivalent Natural Neighbour Field Estimator (NNFE) and a version of Kriging
interpolation adapted to the specific circumstances encountered in galaxy redshift surveys,
the Natural Lognormal Kriging technique. DTFE and NNFE are based on the local
geometry defined by the Voronoi and Delaunay tessellations of the galaxy distribution.

The three reconstruction methods are analysed and compared using mock
magnitude-limited and volume-limited SDSS redshift surveys, obtained on the basis
of the Millennium simulation. We investigate error trends, biases and the topological
structure of the resulting fields, concentrating on the void population identified by the
Watershed Void Finder. Environmental effects are addressed by evaluating the density
fields on a range of Gaussian filter scales. Comparison with the void population in the
original simulation yields the fraction of false void mergers and false void splits.

In most tests DTFE, NNFE and Kriging have largely similar density and topology
error behaviour. Cosmetically, higher order NNFE and Kriging methods produce
more visually appealing reconstructions. Quantitatively, however, DTFE performs
better, even while computationally far less demanding. A successful recovery of the void
population on small scales appears to be difficult, while the void recovery rate
improves significantly on scales > 3 $h^{-1}$ Mpc. A study of small scale voids and the void
galaxy population should therefore be restricted to the local Universe, out to at most
100 $h^{-1}$ Mpc.

Key words: large-scale structure of Universe - cosmology: observations - methods:
data analysis - methods: numerical - methods: statistical 



1 INTRODUCTION

Over the past thirty years a clear paradigm has emerged as
large redshift surveys opened the window onto the distribution
of matter in our Local Universe: galaxies, intergalactic
gas and dark matter exist in a wispy weblike spatial arrangement
consisting of dense compact clusters, elongated
filaments, and sheetlike walls, amidst large near-empty void
regions, with similar patterns existing at higher redshift, albeit
over smaller scales. The Cosmic Web is the fundamental
spatial organization of matter on scales of a few up to a hundred
Megaparsec, scales at which the Universe still resides
in a state of moderate dynamical evolution (Peebles 1980;
Zel'dovich 1970; Bond et al. 1996). Its appearance has
been most dramatically illustrated by the recently produced
maps of the nearby cosmos, the 2dF galaxy redshift survey
(2dFGRS), the Sloan Digital Sky Survey (SDSS) and the
2MASS redshift survey (Colless et al. 2003; Tegmark et al.
2004; Huchra et al. 2005).

According to the standard lore of structure formation, 
structures emerged from small perturbations in the primor- 
dial field of Gaussian density and velocity perturbations. 
Under the force of gravity these fluctuations grow and clus- 
ter to become the present day observed structures. At large 
scales the density field has been evolving (quasi)-linearly and 
still retains much information on cosmological parameters 
and structure formation. The linear density field provides 
an abundance of probes from which cosmological information
can be extracted. Prominent probes are the clustering
of galaxies, which has been used to infer the underlying
primordial power spectrum of density fluctuations (Peebles
1980; Percival et al. 2001; Tegmark et al. 2004), the temperature
and polarisation anisotropies in the Cosmic Microwave
Background (e.g. Smoot et al. 1992; Spergel et al. 2003) and
the shearing of galaxy images by the gravitationally lensed
photon paths through the inhomogeneous matter distribution
(Mellier 1999; Massey et al. 2007; Hoekstra & Jain
2008).

While these (quasi-)linear cosmological probes have 
yielded an impressive amount of cosmological information, 
the exploitation of the pronounced nonlinear patterns of 
the cosmic web towards probing cosmological parameters 
and cosmic structure formation has been less fortuitous. 
Even though the morphology, shape and other statistical
characteristics of the quasi-linear cosmological density field
form a direct reflection of the structure assembly process
in the Universe, on the corresponding small scales nonlinear
growth has significantly altered and erased some of the
essential cosmological information. The absence of an objective
and quantitative procedure for identifying and isolating
clusters, filaments and voids in the cosmic matter distribu- 
tion has been a major obstacle in investigating the structure 
and dynamics of the Cosmic Web. The overwhelming com- 
plexity of the individual structures and their connectivity, 
the huge range of densities and the intrinsic multi-scale na- 
ture prevent the use of simple tools that may be sufficient 
in less demanding problems. However, various interesting
new approaches and methods have been put forward in the
past few years, often based on ideas stemming from image
processing, mathematical morphology, and medical imaging
(e.g. Aragón-Calvo et al. 2007, 2010; Sousbie 2011; Way 2011;
Genovese et al. 2010).

This study is the first in a series in which we system- 
atically investigate the web-related structures found in the 
Sloan Digital Sky Survey (SDSS). It involves a systematic 
program in which we explore the cosmography of the Local 
Universe, assess the statistical characteristics of the density 
field, identify and categorize the voids and filaments within 
the SDSS galaxy redshift sample, and study the biasing of 
the galaxy population with respect to the mass distribution 
and the dependence of galaxy properties on the large scale 
environment. 

The intention of this paper is the reconstruction of the
underlying (nonlinear) density field by translating the spatially
irregularly distributed and discrete galaxy sample in
the SDSS galaxy redshift survey into a representative density
field. The density field reconstructions described in this
study will form the basis of a series of studies in which we
address the statistical and topological properties of the cosmic
density field in our Local Universe and in which we analyze
the cosmography, void population and spinal structure of
the local Cosmic Web in the Sloan Digital Sky Survey. The
cosmographic description of the nearby Universe includes
the identification and cataloguing of the filaments, voids and
clusters in the SDSS galaxy distribution. The key requirement
on the density field reconstruction techniques is therefore
that they allow us to probe the complex quasi-linear
(and nonlinear) structures that we find in galaxy redshift
surveys. This involves the ability to reproduce the distinct
anisotropic (filamentary and wall-like) features that make
up the Cosmic Web, as well as the ability to trace the hierarchical
substructure of the cosmic matter distribution and
the ability to identify the void population that forms one of
its most salient features.

We are specifically interested in the performance of
the fast and efficient linear tessellation-based DTFE density
field reconstruction technique (Schaap & van de Weygaert
2000; Schaap 2007; van de Weygaert & Schaap 2008). In
addition, we investigate two higher-order techniques
which are potentially suited for representing the quasi-linear
density field of the cosmic web: the local natural neighbour
interpolation (NNFE) and a nonlocal Kriging interpolation
technique, the Natural Lognormal Kriging formalism. By
means of an extensive comparison we evaluate which aspects
of the density field are best reproduced by DTFE,
NNFE or Natural Lognormal Kriging, and which of these
methods is best suited as a basis for further structural
analysis.

In order to be able to assess the reliability of the results
of our structural tools, it is crucial to understand the
details and errors in reconstructed density fields. We will
therefore present a detailed comparison study between three
different reconstruction methods, and assess their density
and topological errors over a range of scales. This will range
from the small nonlinear scales, at 1 $h^{-1}$ Mpc, to a scale
of 10 $h^{-1}$ Mpc, which represents the transition from the
quasi-linear to the linear regime.

We focus on the translation of the galaxy positions
in the SDSS DR6 galaxy redshift survey to a representative
density field within the survey volume. Currently,
SDSS encloses the largest and deepest contiguous region of
the nearby Universe mapped by a galaxy redshift survey.
This makes the SDSS an ideal data sample for a full
three-dimensional density field reconstruction. A first impression
of the resulting DTFE density field map of a region in the
SDSS DR6 survey is shown in figure 1. Our techniques will
be generally applicable to any uniform galaxy redshift survey.
Even though the limits and scales of other surveys differ from
those of the SDSS, it will be straightforward to carry over
the results to other redshift surveys.

1.1 Inferring Cosmic Density Fields 

The richest source of data for investigating the intricate weblike
cosmic matter distribution is the distribution of galaxies
in galaxy redshift surveys. We shall make the assumption
that the galaxy distribution is a representative tracer of the
underlying mass density field and of the underlying structure.
Different samples selected from such catalogues may of
course trace different structures.

Figure 1. A visualisation of the (DTFE) SDSS density field (1 $h^{-1}$ Mpc); the contour levels are divided roughly into the overdense
and the underdense regime. Both the galaxies (blue dots) and the density represent a slice of 12 $h^{-1}$ Mpc thickness. Some of the most
prominent features have been named, like the Bootes SuperVoid (Kirshner et al. 1981) and the large supervoid (BS SuperVoid) identified
by Bahcall & Soneira (1982). Also the largest overdense structure, the (Coma) Great Wall, and the location of the Hercules supercluster
within the Wall are indicated.

We set out to explore high fidelity methods for recon- 
struction of the density field from the discrete galaxy dis- 
tribution. We require accurate local density estimates that 
can be used to reliably compute structural and topological 
indicators. Computational efficiency is an important factor 
since our reconstruction technique has to be able to deal 
with galaxy samples of a million galaxies as well as with
N-body simulations comprising orders of magnitude more 
particles. 

There are two immediate and important implications
and complications with respect to our intention of including
the (quasi-)nonlinear components in the density field.
The first one is that we have to be aware of the noise components
that tend to be included in the reconstructions as
well (Schaap 2007; van de Weygaert & Schaap 2008). The
second one is that nonlinear data are typically not well behaved,
being marked by strong gradients in the density field. It
means that we have to take special care of those locations
marked by non-linearities in the data.



1.2 Spatial Point Processes and Continuous Density Fields

We pursue the reconstruction of a density field from a spatial 
point process consisting of irregularly distributed points. We 
assume that the local intensity of the points is a fair tracer 
of the density and that these values are samples of a con- 
tinuous underlying density field. In doing so we appreciate 
that differently defined samples may trace different struc- 
tures. The reconstruction problem we address here is a data 
processing problem. The interpretation of the results is an 
astronomical process. 



When the spatial point process is defined by the galaxy
distribution, the situation will be more complex. The cosmic
web is marked by a distinct luminosity, colour and morphology
segregation. A strong and systematic trend of galaxy
morphology with density was established a few decades
ago by Dressler (1980): early-type galaxies reside preferentially
in rich groups and clusters, and late-type galaxies
mainly in filaments and walls (e.g. Giovanelli et al. 1986).
Other strong systematic clustering trends with luminosity
and colour have also been established, such as in the complete
SDSS DR7 galaxy sample (Zehavi et al. 2010). A fair
sample of galaxies should in principle take this intrinsic segregation
into account. We are pursuing this in a forthcoming
publication. In this study, mainly intent on establishing our
reconstruction technology, we will consider the total galaxy
population. The reconstructed density maps are therefore
unlikely to represent a fair reflection of the underlying dark
matter network, but will nonetheless convey the overall
pattern of the large scale structure.

The reconstruction consists of two fundamental steps. 
The first is the estimation of the local galaxy density. The 
second is the interpolation of these density values to obtain 
a continuous spatial density field. 

Following straightforward grid based interpolation
methods and more sophisticated adaptive filter techniques,
there has been a substantial investment into developing
more advanced techniques. Examples of recently introduced
techniques to follow the multiscale nature of
the Cosmic Web are the Multiscale Morphology Filter
(Aragón-Calvo et al. 2007) and the Hierarchical Spine Web
technique (Aragón-Calvo et al. 2010), a Morse theory based
formalism that is closely related to the Watershed Void
Finder by Platen et al. (2007). An example of an alternative
route involves the wavelet reconstruction by Martínez et al.
(2005).

Figure 2. The schematic outline of our cosmological reconstruction
procedure. The Point Sample is discussed in section 2. The
Density Estimation item will be treated in sections 1.2.1 & 3
and the Density Interpolation methods are introduced in sections
4.1, 4.2 and 4.3. Sections 6, 7 and 8 deal with the
Post-Processing.

Here we concentrate in particular on the density estima- 
tion and the subsequent field interpolation, in anticipation 
of the post-processing and feature detection studies of the 
SDSS DR6/DR7 survey, the subject of the additional papers 
in this series. To deal with the shot-noise and necessary se- 
lection effects in the resulting density fields, the analysis will 
involve filtering, a necessary post-processing step.
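
The filtering step can be illustrated with a minimal sketch, assuming the density field has already been sampled on a regular grid. The function name `smooth_density`, the grid size and the filter radii below are illustrative choices, not values taken from this study; the convolution itself relies on `scipy.ndimage.gaussian_filter`.

```python
import numpy as np
from scipy.ndimage import gaussian_filter


def smooth_density(grid, cell_size, filter_radius, mode="nearest"):
    """Convolve a gridded density field with a Gaussian filter.

    grid          : density sampled on a regular grid (any dimension)
    cell_size     : grid spacing, in the same length unit as filter_radius
    filter_radius : Gaussian dispersion of the filter
    mode          : boundary handling passed to scipy ("wrap" = periodic box)
    """
    sigma_in_cells = filter_radius / cell_size
    return gaussian_filter(np.asarray(grid, dtype=float),
                           sigma=sigma_in_cells, mode=mode)


# Hypothetical example: a Poisson noise field on a 64^3 grid with unit cells,
# smoothed on three filter scales (all numbers are assumptions).
rng = np.random.default_rng(0)
field = rng.poisson(5.0, size=(64, 64, 64)).astype(float)
smoothed = {r: smooth_density(field, 1.0, r, mode="wrap") for r in (1.0, 2.0, 4.0)}
```

With periodic boundaries the filter conserves the total mass on the grid while suppressing the shot-noise fluctuations more strongly as the filter radius grows. For a survey volume with a mask, the boundary treatment would need more care than this sketch provides.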

1.2.1 Density Estimation 

The first step in the reconstruction procedure is density estimation.
We show how this fits into our general scheme in
Fig. 2. A similar, but slightly different, outline can be found
in Kitaura & Enßlin (2008).

In the literature we may find a large variety of density
estimators (e.g. Silverman 1986; Sain 2002; Katkovnik 2002;
Miller 2003). Usually they involve filters of some kind.
Dependent on the filter kernel, density estimates may have a
local character or include the information from a wider range 
of points in a point distribution. Filter kernels may have a 
rigid scale and shape, or they may be adapting themselves 
to the local point density. An example of a global estimator
is a Gaussian kernel, which takes along the information,
be it weighted, from distant points. Well-known examples
of rigid local estimators are the CIC and TSC grid interpolation
formalisms (e.g. Hockney & Eastwood 1981). A considerably
more flexible and adaptive filter kernel, frequently
used in current N-body studies, is that of the spline-based
interpolation recipes used in Smooth Particle Hydrodynamics
codes (e.g. Monaghan 1992), whose scale is determined
by the distance to the K-th nearest neighbour.
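
The difference between a rigid global kernel and an adaptive local one can be sketched as follows. This is a minimal illustration, not one of the estimators used in this study: `gaussian_kernel_density` applies a fixed-width Gaussian to all points, while `knn_density` is a simple top-hat estimate whose scale is set by the distance to the k-th nearest neighbour (a cruder stand-in for the SPH spline kernel). Function names and parameter values are assumptions.

```python
from math import gamma, pi

import numpy as np
from scipy.spatial import cKDTree


def gaussian_kernel_density(points, targets, sigma):
    """Fixed-width Gaussian kernel: every sample point contributes everywhere."""
    dim = points.shape[1]
    norm = (2.0 * pi * sigma**2) ** (-dim / 2.0)
    # Squared distances between every target and every sample point.
    d2 = ((targets[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)
    return norm * np.exp(-0.5 * d2 / sigma**2).sum(axis=1)


def knn_density(points, targets, k):
    """Adaptive top-hat estimate: k points per volume of the sphere
    that reaches the k-th nearest neighbour of each target."""
    dim = points.shape[1]
    dist, _ = cKDTree(points).query(targets, k=k)
    r_k = np.asarray(dist).reshape(len(targets), -1)[:, -1]
    unit_ball_volume = pi ** (dim / 2.0) / gamma(dim / 2.0 + 1.0)
    return k / (unit_ball_volume * r_k**dim)
```

For a uniform Poisson sample both estimators scatter around the mean number density; the k-th neighbour estimate carries a known upward bias of order 1/k, which is one reason more refined adaptive kernels are preferred in practice.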

Recently, various local and adaptive density estimators
have been shown to provide highly favourable results
for complex cosmological matter distributions, marked
by a large dynamic range of scales and density values
and intricate geometric patterns. One of these exploits
the adaptivity of the k-d tree (Ascasibar & Binney 2005;
Sharma & Steinmetz 2006). We also note the use of the
Epanechnikov kernel estimator in the identification of
superclusters (Einasto et al. 2007).

For the three reconstruction techniques addressed in
this study, we use the local DTFE density estimate
introduced by Schaap & van de Weygaert (2000) (see
appendix B). It sets the density value at a sample point
proportional to the inverse volume of the contiguous
Voronoi cell around that sample point, i.e. the union of
the Delaunay tetrahedra to which the sample point
belongs (Schaap & van de Weygaert 2000; Schaap 2007;
van de Weygaert & Schaap 2008). Tessellation-based methods
are based on the realization that the optimal estimate for
the spatial density at the location of a point $x_i$ in a discrete
point sample is given by the inverse of the volume of the
corresponding Voronoi cell (Okabe et al. 2000). They were
introduced by Brown (1965) and Ord (1978) and were
first used in astronomy, for the specific purpose of devising
source detection algorithms, by Ebeling & Wiedenmann
(1993).
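
The density estimate described above can be sketched with `scipy.spatial.Delaunay`. This is a minimal illustration under stated assumptions, not the code used in this study: the function name `dtfe_density` is hypothetical, equal point masses are assumed by default, and the normalisation factor (D+1) is the one that makes the linearly interpolated field integrate to the total mass. Points on the convex hull have truncated contiguous cells, so their densities are overestimated.

```python
from math import factorial

import numpy as np
from scipy.spatial import Delaunay


def dtfe_density(points, masses=None):
    """Density at each sample point from its contiguous Voronoi cell.

    rho_i = (D + 1) * m_i / W_i, where W_i is the summed volume of all
    Delaunay simplices that have point i as a vertex.
    Returns the densities and the Delaunay triangulation.
    """
    points = np.asarray(points, dtype=float)
    n, dim = points.shape
    masses = np.ones(n) if masses is None else np.asarray(masses, dtype=float)

    tri = Delaunay(points)
    # Volume of each simplex: |det(edge vectors)| / D!
    verts = points[tri.simplices]                 # (n_simplices, D+1, D)
    edges = verts[:, 1:, :] - verts[:, :1, :]
    volumes = np.abs(np.linalg.det(edges)) / factorial(dim)

    # Accumulate simplex volumes onto each of their vertices.
    cell_volume = np.zeros(n)
    for j in range(dim + 1):
        np.add.at(cell_volume, tri.simplices[:, j], volumes)

    return (dim + 1) * masses / cell_volume, tri
```

A useful sanity check is mass conservation: summing, over all simplices, the simplex volume times the mean density of its vertices returns the total mass of the sample.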

In an extensive study, Schaap (2007) demonstrated that
the DTFE estimates are substantially better than those of
the rigid grid based CIC and TSC techniques. They also
outperform the adaptive SPH spline estimates, in particular
in areas with substantial density gradients.

1.2.2 Interpolation 

Interpolation on randomly scattered points aims to approxi- 
mate a continuous function constrained by the available data 
points. A wide variety of approaches have been put forward, 
each with their own advantages and disadvantages. These 
methods can be roughly divided into two categories, global 
and local methods. Local methods have the benefit that they 
are fast and able to deal with large data-sets. Global meth- 
ods tend to produce smoother interpolated functions but are 
computationally more expensive. 

An overview and discussion of spatial interpolation
techniques can be found in Press (2007) (chapter 3.4) and
in Watson (1992). We refer to Lombardi (2002) for a detailed
review of the statistical properties of a large number
of techniques, and to Franke (1982) and Amidror (2002) for
a detailed comparison between various methods.

Amongst the more sophisticated interpolation techniques
we can distinguish various classes: inverse distance
based methods (IDW), moving least squares methods, the
class of radial basis function interpolation techniques (RBF),
Kriging interpolation methods (Krige 1951; Matheron 1963,
see sect. 4.3) and triangulation based methods, both the
linear DTFE technique (Schaap & van de Weygaert 2000;
van de Weygaert & Schaap 2008, sect. 4.1) and the higher
order NNFE natural neighbour interpolation technique
(Sibson 1981; Watson 1992, sect. 4.2).



The crucial step in our investigation is
the interpolation step (see the diagram in Fig. 2). Using the same
initial sample point density estimates, the first step of the
reconstruction procedure, we can assess the relative merits
of the DTFE, NNFE and Kriging interpolators.



RBF and Kriging interpolation methods use predefined
kernels to interpolate the field. The kernel in RBF methods
is a basis function that spans the space of all the interpolating
functions. Kriging interpolation uses spatial correlations
between sample points, with its kernel being equal to the
corresponding covariance function. The method, introduced
by Matheron (1963), is the best linear unbiased estimator
of a density value given a set of measured values
at irregularly spaced sample points. Given the commonly accepted
fact that the primordial density perturbation field is Gaussian,
we may therefore see the Kriging interpolator as the
natural choice for reconstructing the cosmological density
fields which emerged out of these Gaussian primordial
conditions.
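
As an illustration of how a covariance kernel yields a linear unbiased estimate, the following is a minimal sketch of textbook ordinary Kriging. It is not the Natural Lognormal Kriging formalism of this study: the exponential covariance model, its parameters and the function names are assumptions chosen for the example. The weights are obtained by solving the Kriging system with a Lagrange multiplier that forces them to sum to one, which is what makes the estimator unbiased.

```python
import numpy as np


def exponential_covariance(r, variance=1.0, length=0.3):
    """Assumed covariance model C(r) = variance * exp(-r / length)."""
    return variance * np.exp(-r / length)


def ordinary_kriging(points, values, targets, cov=exponential_covariance):
    """Estimate the field at `targets` from `values` measured at `points`."""
    points, values, targets = (np.asarray(a, dtype=float)
                               for a in (points, values, targets))
    n = len(points)

    def pair_dist(a, b):
        return np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)

    # Kriging matrix: sample-sample covariances bordered by the
    # unbiasedness constraint (weights sum to one).
    system = np.ones((n + 1, n + 1))
    system[:n, :n] = cov(pair_dist(points, points))
    system[n, n] = 0.0

    # Right-hand side: sample-target covariances, one column per target.
    rhs = np.ones((n + 1, len(targets)))
    rhs[:n, :] = cov(pair_dist(points, targets))

    weights = np.linalg.solve(system, rhs)[:n, :]
    return weights.T @ values
```

Without a noise (nugget) term this estimator reproduces the measured values exactly at the sample points, and it returns a constant when all input values are equal. The dense linear solve scales as the cube of the number of sample points, which illustrates why nonlocal methods are computationally more expensive than local ones.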

There is a distant relationship between Kriging interpolation
and Wiener filtering techniques (Wiener 1949;
Rybicki & Press 1992). However, Wiener filtering is based
on a different philosophy than Kriging, in that it includes
a model for the noise and is evaluated in Fourier space.
Also, classical Wiener filtering is predicated on an underlying
Gaussian distribution. As a result, it has the serious
disadvantage of suppressing or substantially diluting nonlinear
structures of interest. More advanced recent developments
and applications of Wiener filters to the reconstruction of the
density distribution have largely remedied this shortcoming
(Kitaura & Enßlin 2008; Kitaura et al. 2009, 2010).

A rather different approach is advocated in the
triangulation-based interpolation techniques. Both
the linear Delaunay Tessellation Field Estimator
(DTFE; Schaap & van de Weygaert 2000; Schaap 2007;
van de Weygaert & Schaap 2008) and the higher-order
Natural Neighbour Field Estimator (Sibson 1981; Watson
1992; Braun & Sambridge 1995) use the neighbourhood
relationships defined by the Voronoi and Delaunay tessellation
of the point sample to establish a fully adaptive, irregular
and local interpolation grid. DTFE uses the Delaunay
triangulation to reconstruct the underlying spatial (density)
distribution in a self-adaptive, mass conservative
and parameter free way. In combination with the DTFE tessellation-based
sample point density estimates (see sect. 1.2.1),
the DTFE interpolation leads to a volume-covering density
field which has been shown to recover the hierarchical as
well as the anisotropic morphology of the Cosmic Web
(Schaap 2007; van de Weygaert & Schaap 2008).
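
The linear interpolation step on the Delaunay grid can be sketched with SciPy, whose `LinearNDInterpolator` performs piecewise-linear (barycentric) interpolation inside each Delaunay simplex. The wrapper name `delaunay_linear_field` is hypothetical, and the vertex values are taken as given (for instance, density estimates at the sample points); this is an illustration of the technique rather than the implementation used in this study.

```python
import numpy as np
from scipy.interpolate import LinearNDInterpolator
from scipy.spatial import Delaunay


def delaunay_linear_field(points, values):
    """Return a callable field that is linear within each Delaunay simplex.

    Locations outside the convex hull of the sample evaluate to NaN.
    """
    tri = Delaunay(np.asarray(points, dtype=float))
    return LinearNDInterpolator(tri, np.asarray(values, dtype=float),
                                fill_value=np.nan)
```

Because the interpolation is linear within each simplex, any field that is itself linear in the coordinates is reproduced exactly inside the convex hull, while sharp features are resolved only down to the local sampling density.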



1.3 DTFE, NNFE and Kriging 

The Delaunay Tessellation Field Estimator (DTFE) method
will be compared to two other techniques having a similar
potential for a proper reconstruction of the cosmic web. The
Natural Neighbour Field Estimator (NNFE) shares the
local nature of DTFE, but involves higher order interpolations.
The third formalism, Natural Lognormal Kriging, is
a version of Kriging interpolation, and thus involves a
nonlocal higher-order interpolation methodology.



1.4 Outline of this study 

We start by describing the data samples in section 2: the
SDSS survey sample and a mock galaxy sample which mimics
the SDSS, obtained from the Millennium simulation. The
local DTFE density estimate at the location of each of the
sample galaxies is the subject of section 1.2.1. In the
subsequent section we present and describe each of the three
interpolation techniques investigated in this study: DTFE
in subsection 4.1, NNFE in subsection 4.2 and Lognormal
Kriging in subsection 4.3. The comparison throughout the
paper is based on one specific sample, the density field
reconstruction of the SDSS mock galaxy sample. Following
a discussion in sect. 5 of the qualitative appearance of the
density maps produced by each of the three methods, we turn to an
intensive quantitative error and quality analysis of the density
field in sect. 6. In sect. 7 this is followed by an investigation
of the topological structure of the weblike galaxy distribution
in the survey, mainly based on the void population as
traced by the Watershed Void Finder. Section 8 presents the
density field reconstruction of the SDSS data sample and in
sect. 9 we summarize this study.

Following this first paper in a series will be a statisti- 
cal study of the density field, focusing in particular on the 
one-point probability density function. Subsequently, we will 
present and discuss the cosmography of the reconstructed lo- 
cal Universe. Later studies will analyze the void population 
in the SDSS density field, and concentrate on the properties
of galaxies as a function of the large scale environment as
characterized by our technique.



2 THE DATA 

Our study is based on two major datasets. The principal 
one is the genuine SDSS DR6 galaxy redshift survey. For 
the purpose of understanding the errors and artefacts in 
the density field reconstruction we use a set of galaxy mock 
catalogues that model this SDSS dataset. The mock samples
are obtained from the Millennium simulation (Springel et al.
2005).



2.1 The SDSS galaxy sample 

For our analysis we use the main galaxy sample in the North
Galactic Cap from the 6th data release of the Sloan Digital
Sky Survey (SDSS; Strauss et al. 2002; York et al. 2000;
Stoughton et al. 2002; Adelman-McCarthy et al. 2008).

The SDSS DR6 data release consists of various contiguous
regions. We restrict ourselves to the largest contiguous
region, the northern strip of the North Galactic Cap
(NGC) region. The sample was retrieved from the SDSS
"casjobs server" (www.sdss.org) using the SQL query
interface. Relevant properties were downloaded and most of
the post-processing was done on a workstation. We did not
attempt to assign redshifts to galaxies that lack one because of fiber
collisions. This gives a lower density estimate at the
position of the missing redshift. However, given the sampling
scheme used to resample the density field to a regular grid,
the probability of sampling one of the affected areas is very
low. The problem occurs only in the high density regions of
the catalogue.

Figure 3. The top figure shows the SDSS galaxies in the X and Y coordinates out to a distance of 300 $h^{-1}$ Mpc. The lower figure shows
an XZ slice perpendicular to the XY plane at Y = 135 $h^{-1}$ Mpc. The corresponding boundaries of both slices are indicated in the other
figure, e.g. grey lines in the top figure show the Y limits of the bottom figure and vice versa. Both slices have a thickness of 10 $h^{-1}$ Mpc.

The spectroscopic SDSS galaxy sample is almost complete
between Petrosian magnitude limits of $m_r = 14.5$ and
$m_r = 17.77$. We assume that the completeness of the sample
does not vary significantly, even though there are in fact
some angular variations (Blanton et al. 2003).

For our study, we select the SDSS galaxies that are located
within a comoving box of side 600 $h^{-1}$ Mpc. In terms of the
survey coordinates (X,Y,Z) (see app. A for definition), the
observer is located at (X,Y,Z) = (300, 0, 300) $h^{-1}$ Mpc and
the centre of the northern strip is rotated to lie parallel to
the Y-axis starting at (X,Z) = (300, 300) $h^{-1}$ Mpc. In Fig. 3
the galaxies are plotted in the XY-projection (top) and the
XZ-projection (bottom).



Within the (600 $h^{-1}$ Mpc)$^3$ volume there are a total of
311474 galaxies brighter than the magnitude
limit $m_r = 17.77$.



2.2 The SDSS mock samples 

The Millennium simulation (jSpringel et al.ll2005l ) was used 
to construct galaxy mock catalogues which emulate the 
SDSS galaxy redshift sample. They are needed to get es- 
timates of the errors induced by the magnitude selection, 
redshift distortions as a result of peculiar velocities and er- 
rors resulting from the survey mask. 

We use the semi-analytical galaxy samples of
De Lucia & Blaizot (2007), Bower et al. (2006) and
Bertone et al. (2007) to construct our own mock samples.
Although they are less detailed than the mock catalogue generated
by Blaizot et al. (2005), they are perfectly suited for a
representation of all necessary aspects of the SDSS sample.

Figure 4. SDSS mock catalogues from the Millennium simulation. Top: full Millennium catalogue. Centre: magnitude limited mock
catalogue. Bottom: volume limited mock catalogue. The mock galaxies are indicated by black dots. The blue and orange contour lines are
NNFE density contour lines at density contrast $\rho/\rho_u = 0.25$ (blue) and 1.0 (orange). For reasons of clarity, the top slice has a thickness
of 6.0 $h^{-1}$ Mpc while the magnitude and volume limited slices each have a thickness of 12.0 $h^{-1}$ Mpc: showing a thinner slice for the
last two samples would show too few galaxies.

The same (X,Y,Z) coordinate system and box size of
600 $h^{-1}$ Mpc have been used for the mock galaxy samples.
The mock galaxy catalogues are constructed as follows: 

• Periodically tile the Millennium cube to obtain enough 
cosmic volume. 

• Calculate the redshift of the model galaxies with respect to the
observer.

• Compute the apparent magnitude of each model galaxy
from its absolute magnitude and redshift.

• Select the model galaxies brighter than $m_r = 17.77$.

• Add the peculiar velocity to the Hubble redshift to obtain
the total redshift.

• Apply the observational mask of the DR6-NGC sample 
to decide whether the model galaxy is included in the mock 
catalogue. 
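
The steps listed above can be sketched as follows. This is a simplified illustration under loudly stated assumptions, not the procedure actually used to build the mock catalogues: it uses the low-redshift Hubble law with distances in $h^{-1}$ Mpc, approximates the luminosity distance by d(1+z), ignores K-corrections, adds the peculiar velocity linearly to the redshift, and treats the survey mask as a user-supplied callable. The function name `make_mock` and all argument names are hypothetical.

```python
import numpy as np

C_KMS = 299792.458   # speed of light [km/s]
H0 = 100.0           # Hubble constant [km/s per h^-1 Mpc]
M_R_LIMIT = 17.77    # apparent magnitude limit quoted in the text


def make_mock(pos, vel, abs_mag, observer, box, n_tiles=1, mask=None):
    """pos [h^-1 Mpc], vel [km/s], abs_mag per galaxy; returns selected galaxies."""
    pos = np.asarray(pos, dtype=float)
    vel = np.asarray(vel, dtype=float)
    abs_mag = np.asarray(abs_mag, dtype=float)
    observer = np.asarray(observer, dtype=float)

    # 1. Periodically tile the simulation cube n_tiles times along each axis.
    shifts = np.indices((n_tiles,) * 3).reshape(3, -1).T * float(box)
    pos = (pos[None, :, :] + shifts[:, None, :]).reshape(-1, 3)
    vel = np.tile(vel, (len(shifts), 1))
    abs_mag = np.tile(abs_mag, len(shifts))

    # 2. Hubble redshift with respect to the observer (low-z Hubble law).
    rel = pos - observer
    dist = np.linalg.norm(rel, axis=1)
    z_hubble = H0 * dist / C_KMS

    # 3. Apparent magnitude (assumed: d_L = d (1 + z), no K-correction).
    app_mag = abs_mag + 5.0 * np.log10(dist * (1.0 + z_hubble)) + 25.0

    # 4. Keep galaxies brighter than the magnitude limit.
    keep = app_mag < M_R_LIMIT

    # 5. Add the line-of-sight peculiar velocity to the Hubble redshift.
    unit = rel / dist[:, None]
    z_obs = z_hubble + (vel * unit).sum(axis=1) / C_KMS

    # 6. Optional survey mask: callable mapping unit vectors to booleans.
    if mask is not None:
        keep = keep & np.asarray(mask(unit), dtype=bool)

    # Redshift-space position along the unchanged line of sight.
    s_pos = observer + unit * (C_KMS * z_obs / H0)[:, None]
    return {"s_pos": s_pos[keep], "z_obs": z_obs[keep], "app_mag": app_mag[keep]}
```

The sketch assumes no galaxy coincides with the observer position. A realistic implementation would use the cosmological distance-redshift relation of the simulation and the survey's angular mask in sky coordinates.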



2.3 Redshift Space Distortions 

In this study we assess redshift space density maps as well 
as partially corrected "real" space density maps. 

Redshift space surveys like the SDSS are beset by dis- 
tortions in the estimated distance. The result of large co- 
herent cosmic flows, infall velocities onto clusters and highly 
nonlinear "thermal" velocities within clusters, these redshift 
distortions can have a dramatic effect on the estimated distances
and reshape the large scale matter distribution. However,
for our purposes here, namely testing and comparing
the methods, the presence or absence of fingers of God is
immaterial.

Hence, in this paper, which deals with the Mock SDSS 
catalogues, we do not correct for the redshift distortions in- 
duced by large scale coherent cosmic flows. In particular the 
DTFE density reconstruction is locally adaptive and so the 
reconstruction is not corrupted by the presence of elongated 
radial features; this can be seen in figure 4.

The redshift distortions will, however, be addressed in a 
following paper in which we analyse the real SDSS catalogue. 



2.4 Magnitude- vs. Volume-limited Samples 

For the analysis of the SDSS sample, we extract two differ- 
ent samples from the full SDSS galaxy sample. These are 
a volume limited (i.e. absolute magnitude limited) sample 
and an apparent magnitude limited sample. Each sample is 
used for different aspects of our analysis. 



Volume Limited sample 

A volume-limited galaxy sample is defined in order to
ensure a uniform galaxy coverage over the survey volume. A
volume-limited sample consists of a subset of galaxies which
are homogeneously sampled throughout the sample volume.
It has the advantage that each sample galaxy is an equal
weight tracer of the underlying density field, and the
resulting field will be statistically uniform. Our volume-limited
sample has a distance limit of 300 $h^{-1}$Mpc and includes all
galaxies with $M_r < -20.45$, roughly representing
the galaxies brighter than $L_*$.

While the uniformity of the volume-limited samples
assures a straightforward error assessment for any analysis, it
has the disadvantage of losing the high spatial resolution
represented by fainter galaxies nearby; it therefore does not
necessarily have the smallest error.

Magnitude Limited sample 

The magnitude limited sample contains all 311474 SDSS
galaxies brighter than $m_r = 17.77$. While a magnitude
limited sample retains all sampled information, one needs
to correct for the inhomogeneous selection process.

A characteristic of magnitude-limited surveys is the
change of intrinsic spatial resolution as we proceed out to
larger distances. Potentially, this could be a serious issue
if galaxies were biased in a very complicated fashion
(e.g. higher order biasing). This would render it very
difficult to infer the density field at large distances, where
only the most luminous objects remain visible. By
default, we therefore assume that all galaxies, independent
of their luminosity, are a fair tracer of the density field.

Following this assumption, we correct for the dilution as
a function of survey depth by weighting each sample galaxy
by the reciprocal $w(z)$ of the radial selection function $\psi(z)$
at the distance of the galaxy. For the SDSS, the selection
function $\psi(z)$ as a function of redshift $z$ is well fitted by the
expression forwarded by Efstathiou & Moody (2001),

$$\psi(z) = \exp\left[-\left(\frac{z}{z_r}\right)^{\beta}\right], \qquad (1)$$

where $z_r$ is the characteristic redshift of the distribution and
$\beta$ specifies the steepness of the curve. The corresponding
number density $N(z)$ of galaxies at redshift $z$ is

$$N(z)\,dz = A z^2 \psi(z)\,dz = A z^2 \exp\left[-\left(\frac{z}{z_r}\right)^{\beta}\right] dz, \qquad (2)$$

where $A$ is a normalization constant and the $z^2$ term
represents the increase of volume as function of $z$. The resulting
galaxy weights $w(z)$ are

$$w(z) = 1/\psi(z) = \exp\left[\left(\frac{z}{z_r}\right)^{\beta}\right]. \qquad (3)$$

When incorporating the weights $w(z)$ into the density field
reconstruction, the normalization of the resulting density 
field may be achieved by modelling the details of the selec- 
tion function and calculating the appropriate normalisation 
constant. We chose to follow the alternative of calculating 
the average of the reconstructed density field and subtract- 
ing it from the reconstruction. This is a simple and straight- 
forward procedure, and perfectly valid as long as the volume 
of the sample is large enough to enable the estimate of the 
true average. 
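The weighting scheme can be illustrated with a minimal Python sketch. It assumes the selection function has the form $\psi(z) = \exp[-(z/z_r)^{\beta}]$; the function names and the parameter values `z_r = 0.1` and `beta = 1.5` are hypothetical choices for illustration, not the values fitted to the SDSS.

```python
import math


def selection_function(z, z_r, beta):
    """Radial selection function psi(z) = exp[-(z / z_r)**beta]."""
    return math.exp(-((z / z_r) ** beta))


def galaxy_weight(z, z_r, beta):
    """Weight w(z) = 1 / psi(z) assigned to a galaxy at redshift z."""
    return math.exp((z / z_r) ** beta)


def expected_counts(z, z_r, beta, amplitude=1.0):
    """N(z) = A z^2 psi(z): volume growth times the selection function."""
    return amplitude * z ** 2 * selection_function(z, z_r, beta)


if __name__ == "__main__":
    # Hypothetical parameters, chosen only to show the trend with depth.
    z_r, beta = 0.1, 1.5
    for z in (0.02, 0.05, 0.10, 0.15):
        print(z, selection_function(z, z_r, beta), galaxy_weight(z, z_r, beta))
```

Nearby galaxies receive a weight close to one, while the sparse population of distant galaxies is weighted up strongly, which is why the shot noise of the weighted field grows with distance.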

Mock Samples 

From the mock catalogues we select three different samples. 
The full mock catalogue comprises all Millennium (semi- 
analytical) model galaxies within the DR6-NGC mask. The 
magnitude limited mock sample includes all model galax- 
ies resulting from the procedure described above. The third 
sample is a volume limited set with an absolute magnitude 
$M_r < -20.45$.

To appreciate the differences between the magnitude-limited
and the volume-limited galaxy sample, and also the
full mock catalogue within the SDSS volume, the three mock
catalogues are shown in Fig. [l].

The contours superimposed on the corresponding
galaxy distributions are the resulting NNFE density field
contours (see sect. 4.2).

Figure 5. Voronoi tessellations, Delaunay tessellations and Natural Neighbours. Left: The Delaunay triangulation (grey lines) of a point
set (circles). For the central point the natural neighbours are indicated by the blue points, along with the corresponding Delaunay edges
(red). The hatched region is the contiguous Voronoi cell of the central point, the composite of all Delaunay triangles shared with its
natural neighbours. Right: The Voronoi tessellation of a point sample (black diamonds). Relevant for Natural Neighbour interpolation
(NNFE): following the insertion of a central point (black triangle), a new Voronoi cell (region enclosed by the dark blue polygonal
boundary) is computed, the second order Voronoi cell. The gray shaded areas are the overlapping regions with the original Voronoi
tessellation.



3 LOCAL DTFE DENSITY ESTIMATE 

Throughout this study we use the local DTFE density
estimate, following the definition by Schaap & van de Weygaert
(2000). In appendix B one may find more details of the
DTFE procedure which we followed (for an extensive
review see van de Weygaert & Schaap 2008). Implicitly, and
for simplicity, we assign to each sample galaxy the same
mass $m_i$, i.e. the density value is predicated on the number
density of galaxies.

The sample point DTFE density value is inversely
proportional to the volume of the local neighbourhood as
defined by the Voronoi tessellation of the spatial galaxy
distribution. Schaap & van de Weygaert (2000) argued that
the inverse of the volume of the contiguous Voronoi cell
is the proper density estimate, assuring mass conservation
for the subsequent linear interpolation step. The contiguous
Voronoi cell, sometimes dubbed umbrella in the
computational geometry literature, is the region defined by all
Delaunay tetrahedra of which a given sample point is a vertex
and which it shares with its natural neighbours. A
two-dimensional illustration of a contiguous Voronoi cell of a
point is shown as the surrounding hatched region in the
lefthand frame of Fig. 5.

The density value at each sample point is determined
following the construction of the Delaunay triangulation (see
e.g. Delaunay 1934; Aurenhammer 1991)^1. Within the
triangulation we identify for each sample point $i$ all $N_i$
neighbouring tetrahedra $T_j$ (Fig. 5), which together constitute
the contiguous Voronoi cell $W_i = \bigcup_j T_j$. Summation of the
individual tetrahedral volumes $V(T_j)$ yields the volume of the
contiguous Voronoi cell,

$$V(W_i) = \sum_{j=1}^{N_i} V(T_j). \qquad (4)$$

For the three-dimensional SDSS sample volume, the resulting
DTFE estimate of the density $f_i$ at sample point $i$ is
(see equation B2)

$$f_i = \frac{4\, w(z_i)}{V(W_i)}, \qquad (5)$$



where the weight $w(z_i)$ is the sample selection weight at
the galaxy's redshift $z_i$ (equation 3). Note that the factor
four takes account of the fact that in three dimensions each
tetrahedron is shared by its four vertices, so that its volume
is counted in four contiguous Voronoi cells. In practice, the
density of all particles is calculated by looping in sequence
over all Delaunay tetrahedra.
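The estimate at the sample points can be sketched in a few lines of Python. This is a minimal illustration that assumes SciPy's `scipy.spatial.Delaunay` for the tessellation (the work described here used CGAL instead); the function name `dtfe_point_densities` and the uniform random test points are hypothetical.

```python
import math

import numpy as np
from scipy.spatial import Delaunay


def dtfe_point_densities(points, weights=None):
    """DTFE density at each sample point: (D + 1) * w_i / V(W_i)."""
    points = np.asarray(points, dtype=float)
    n_points, dim = points.shape
    if weights is None:
        weights = np.ones(n_points)

    tri = Delaunay(points)
    simplices = tri.simplices  # shape (n_simplices, dim + 1)

    # Volume of each simplex: |det(edge vectors)| / dim!
    edges = points[simplices[:, 1:]] - points[simplices[:, :1]]
    volumes = np.abs(np.linalg.det(edges)) / math.factorial(dim)

    # Loop over the simplices once: every simplex adds its volume to the
    # contiguous Voronoi cell of each of its dim + 1 vertices.
    cell_volume = np.zeros(n_points)
    for k in range(dim + 1):
        np.add.at(cell_volume, simplices[:, k], volumes)

    density = (dim + 1) * np.asarray(weights, dtype=float) / cell_volume
    return density, tri, volumes


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sample = rng.random((500, 3))  # hypothetical uniform point sample
    dens, tri, vols = dtfe_point_densities(sample)
    # Integral of the piecewise linear field over the convex hull.
    print(np.sum(vols * dens[tri.simplices].mean(axis=1)))
```

With unit weights, the integral of the linearly interpolated field over the convex hull equals the number of sample points, which is the mass conservation property that motivates the factor $D + 1$.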

3.1 Shot noise errors 

A local density estimator such as the contiguous Voronoi
cell has the advantage of being very sensitive to the signal.
However, this also implies a high sensitivity to the
shot noise present in the data.

^1 The Delaunay triangulation in this work has been computed
using the CGAL library (Computational Geometry Algorithms
Library, CGAL).

To appreciate the influence of shot noise on the DTFE
density estimates, we turn to the probability distribution of
the estimated intensity $\hat\lambda$ for a 3D Poisson point process with
intensity $\lambda = 1$. Schaap (2007) found that it can be very well
approximated by

$$p(\hat\lambda) = \frac{128}{3}\, \hat\lambda^{-6} \exp\left(-\frac{4}{\hat\lambda}\right). \qquad (6)$$

The first observation is that the DTFE density estimator is
unbiased, as the mean of $p(\hat\lambda)$ is equal to one. An estimate
of the error involved follows from the variance, $\sigma^2 = 1/3$, which
corresponds to a standard deviation of $\sim 57\%$. Evidently, the
distribution is non-Gaussian, with a long tail extending towards
high density values (cf. Fig. 7).



3.2 Centroidal Voronoi tessellations 

The high density tail of the DTFE density estimate, and the
implied shot noise level, can be suppressed or regularised by
using the centroidal Voronoi tessellation (CVT) (see Lloyd
1982; Browne 2007). For a CVT the generating point
distribution is such that the generating points are the mass
centres of the resulting Voronoi cells.

The calculation of a CVT is usually done by means of 
an iterative procedure known as Lloyd iteration. Starting 
with an originally random point distribution, the centre of 
mass of the corresponding Voronoi cells is computed. Sub- 
sequently, the points are displaced to these centres. After a 
sequence of iteration steps, the resulting point distribution 
tends to converge to a proper CVT constellation. Effectively, 
the points have been repelling each other. 
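The Lloyd iteration is easiest to see in one dimension, where the Voronoi cell of a point is the interval between the midpoints to its two neighbours and the centre of mass of a uniform cell is its midpoint. The sketch below is such a one-dimensional analogue on the unit interval, not the three-dimensional procedure; all function names are hypothetical, and the local density is taken as the inverse of the ordinary Voronoi cell length.

```python
import random
import statistics


def voronoi_cells_1d(points, lo=0.0, hi=1.0):
    """Return the sorted points and their 1D Voronoi cells on [lo, hi]."""
    xs = sorted(points)
    mids = [(a + b) / 2 for a, b in zip(xs, xs[1:])]
    bounds = [lo] + mids + [hi]
    return xs, list(zip(bounds, bounds[1:]))


def lloyd_step(points, lo=0.0, hi=1.0):
    """Move every point to the centre of mass of its Voronoi cell."""
    _, cells = voronoi_cells_1d(points, lo, hi)
    return [(left + right) / 2 for left, right in cells]


def density_scatter(points, lo=0.0, hi=1.0):
    """Relative scatter of the density estimates 1 / (N * cell length)."""
    xs, cells = voronoi_cells_1d(points, lo, hi)
    dens = [1.0 / (len(xs) * (right - left)) for left, right in cells]
    return statistics.pstdev(dens) / statistics.mean(dens)


if __name__ == "__main__":
    rng = random.Random(42)
    pts = [rng.random() for _ in range(200)]  # initial random distribution
    for iteration in range(0, 11, 2):
        print(iteration, density_scatter(pts))
        pts = lloyd_step(lloyd_step(pts))
```

The scatter of the density estimates shrinks with each iteration as the points relax towards a regular spacing, the one-dimensional counterpart of the narrowing of the density distribution under CVT regularisation.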

An impression of the CVT iteration procedure can be
obtained from Fig. 6. For an initial point distribution,
the resulting intensity distribution $p(\hat\lambda)$ for the tessellations
obtained after zero, two, four, six, eight and ten Lloyd
iterations is shown in the righthand frame. Clearly, a CVT
involves a much more regular distribution: after four
iterations $p(\hat\lambda)$ has turned into a narrow and nearly symmetric
distribution whose high-end tail is almost absent (Fig. 7).
Potentially, a CVT might therefore help to suppress the
density estimate error and its asymmetric distribution.



3.3 DTFE noise: a case study 

A visual impression of the shot noise involved in the density
estimate is provided by Fig. 8. It concerns a random sample
of 4500 points distributed according to an anisotropic
Gaussian distribution. This configuration is more representative
of what may be expected in the real galaxy distribution.

Densities were estimated according to the pure DTFE
procedure (see equation 5) and following a Lloyd CVT
procedure of 5 iterations. The top row of Fig. 8 shows the
density contours of the original peak and the raw DTFE and
DTFE/CVT field reconstructions. The raw DTFE
reconstruction (centre) is highly irregular compared to the
original contours (left), while the CVT contours are much more
regular (right). The linear profile along the y-axis (at x = 0.5,
bottom left panel) emphasizes the visual impression: the
DTFE profile (orange) is marked by salient peaks which
reflect the high density tail of $p(\hat\lambda)$, while the DTFE/CVT
profile (blue) appears to adhere considerably better to the
original profile.

Figure 6. Centroidal Voronoi Tessellation. The (original)
Voronoi tessellation for the points of Fig. 5 is plotted in orange.
In black we show the centroidal Voronoi tessellation with
corresponding points after two Lloyd iterations. The displacements are
indicated by the gray lines.

Figure 7. The error probability distribution of the centroidal
Voronoi tessellation based galaxy density estimator for a
homogeneous Poisson sample. Arranged from the broadest to
the narrowest, the curves correspond to 0, 2, 4, 6 and 10
Lloyd iterations.

Figure 8. DTFE and DTFE/CVT reconstruction of an anisotropic Gaussian peak. Sampling the peak by 4500 random points, the
top row shows the contour levels of the reconstructions: original (left), DTFE (centre) and DTFE/CVT with 5 Lloyd iterations. The
resulting density maps after Gaussian smoothing, with $R_f = 0.05$, are shown in the central row. The bottom row shows linear density
profiles through the resulting mass distribution. Left: linear profiles along the y-axis at x = 0.5, for the original (black), DTFE
(orange) and DTFE/CVT (blue). Central and Right: following the same colour scheme, linear profiles through the filtered density field
reconstruction, along the y-axis (at x = 0.5) and the x-axis (at y = 0.5).

To appreciate the average trend in the density
reconstruction, we filter the original, DTFE and DTFE/CVT
fields with a Gaussian filter ($R_f = 0.05$). In addition to
the suppressed shot noise, the resulting density level plots,
shown in the central row of Fig. 8, reflect the way in which
the reconstructions affect the shape of the Gaussian peak.
While the DTFE reconstruction retains the shape of the
original, entirely in accordance with the findings by Schaap
(2007), the DTFE/CVT appears not to do so. This impression
is confirmed by the two linear profiles, along the y-axis
at x = 0.5 and along the x-axis at y = 0.5, shown in the
bottom central and righthand panels. While the DTFE
reconstruction is able to follow accurately the shape of the
original, the DTFE/CVT reconstruction displays considerable
deviations.

We may therefore conclude that while CVT appears to 
suppress the shotnoise effects on small scales, it is not able 
to follow the morphology of the mass distribution on larger 
scales. Since we will be filtering our fields in a similar way, 
this result leads us to pursue our study on the basis of the 
pure DTFE density estimate procedure without the CVT 
regularization. 



4 INTERPOLATION METHODS 

The key aspect in our investigation is the interpolation step 
of each of the three reconstruction methods. Here we de- 
scribe the interpolation steps of the investigated methods, 
the DTFE, NNFE and Kriging interpolators. 



4.1 Interpolation Method I: 

Delaunay Tessellation Field Estimator (DTFE) 

The most straightforward way to interpolate and/or re- 
construct a density field is by linear interpolation between 
neighbouring data points. The linear reconstructed field is 
continuous throughout the sample volume. Within each in- 
terpolation interval the first derivative remains constant, al- 
though it is discontinuous at the boundaries between the 
intervals. 

The Delaunay Tessellation Field Estimator
(Schaap & van de Weygaert 2000; Schaap 2007;
van de Weygaert & Schaap 2008) is the multidimensional
equivalent of simple piecewise one-dimensional linear
interpolation from an irregularly distributed set of points.
DTFE generalizes the concept of natural interpolation
interval to any dimension D by adopting the Delaunay
tetrahedra of a multidimensional point set as such. It
exploits the adaptive and minimum triangulation properties
of Delaunay tessellations by using them as adaptive spatial
interpolation intervals for irregular point distributions
(Bernardeau & van de Weygaert 1996).

Once the Delaunay tessellation has been constructed,
and the densities at each sample point determined (see
sect. 3), we determine the density gradient $\nabla f$ within
each Delaunay tetrahedron $T_j$ from the density values
$(f(\mathbf{r}_0), f(\mathbf{r}_1), f(\mathbf{r}_2), f(\mathbf{r}_3))$ at its four vertices, located at
$\mathbf{r}_0, \mathbf{r}_1, \mathbf{r}_2, \mathbf{r}_3$.

Using the density gradients in the Delaunay tetrahedra,
the DTFE density value at any point $\mathbf{r}$ can be calculated by
determining in which tetrahedron it is located and
subsequently computing its density estimate $\hat f(\mathbf{r})$ from the linear
equation

$$\hat f(\mathbf{r}) = f(\mathbf{r}_0) + \nabla f \cdot (\mathbf{r} - \mathbf{r}_0). \qquad (7)$$

To obtain an image of the density field, one calculates these
density estimates at each of the voxel locations of the image
grid. For a more detailed outline of the DTFE method we
refer to appendix B.
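The linear interpolation inside a single tetrahedron can be written out explicitly. The sketch below, using NumPy, obtains the constant gradient by requiring the linear field to reproduce the values at the four vertices and then evaluates the field at an arbitrary position; the function names and the example tetrahedron are hypothetical.

```python
import numpy as np


def tetrahedron_gradient(vertices, values):
    """Constant gradient of the linear field through four vertex values.

    Solves (r_k - r_0) . grad = f_k - f_0 for k = 1, 2, 3.
    """
    vertices = np.asarray(vertices, dtype=float)
    values = np.asarray(values, dtype=float)
    edge_matrix = vertices[1:] - vertices[0]  # rows are edge vectors
    return np.linalg.solve(edge_matrix, values[1:] - values[0])


def dtfe_interpolate(r, vertices, values):
    """Evaluate f(r) = f(r_0) + grad . (r - r_0) within one tetrahedron."""
    vertices = np.asarray(vertices, dtype=float)
    values = np.asarray(values, dtype=float)
    grad = tetrahedron_gradient(vertices, values)
    return float(values[0] + grad @ (np.asarray(r, dtype=float) - vertices[0]))


if __name__ == "__main__":
    # Hypothetical tetrahedron and vertex densities.
    verts = [[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]]
    dens = [1.0, 2.0, 4.0, 8.0]
    centroid = np.mean(verts, axis=0)
    print(dtfe_interpolate(centroid, verts, dens))
```

By construction the interpolant returns the vertex values at the vertices and their mean at the centroid of the tetrahedron.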

An impression of a DTFE interpolated field in a
cosmological context is presented in Figure 9 (top righthand
panel). It concerns a density field reconstruction from a
dataset extracted from a Millennium mock sample (top
lefthand panel). The galaxy selection follows the distant
observer approximation, i.e. following parallel lines of sight,
and assumes the magnitude limit $m_r = 17.77$ of the SDSS
redshift survey.

The resulting DTFE density field is the level map in 
the top righthand panel. DTFE recovers the fine small-scale 
structures and at the same time adapts itself to the larger 
scale structures at greater distances. It also reveals the linear 
interpolation artifacts, the triangular shaped low-intensity 
wings. These are especially noticeable when the data points 
are sparse. We must note that these wings are not significant 
in mass, but arise when one takes a lower dimensional (1 or 
2) section through the data. 



4.2 Interpolation Method II: 

Natural Neighbour Field Estimator (NNFE) 

The DTFE is a piecewise linear interpolation ($C^0$) method.
In a sense it is a linear version of a larger class of
tessellation based interpolation methods. Of these, Natural
Neighbour interpolation (Sibson 1981; Watson 1992;
Braun & Sambridge 1995) is the best known higher
order tessellation based method (for more details see
van de Weygaert & Schaap 2008).

The Natural Neighbour Interpolation formalism is
a generic higher-order multidimensional interpolation,
smoothing and modelling procedure utilizing the concept
of natural neighbours to obtain locally optimized measures
of system characteristics. Its theoretical basis was
developed and introduced by Sibson (1981), while extensive
treatments and elaborations of nn-interpolation may be found in
Watson (1992) and Sukumar (1998). As has been demonstrated
by telling examples in geophysics (Braun & Sambridge
1995) and solid mechanics (Sukumar et al. 1998; Sukumar
1998), NN methods hold tremendous potential for
grid-independent analysis and computations.

According to the Sibson natural neighbour interpolation,
the interpolated value $\hat f(\mathbf{r})$ at a position $\mathbf{r}$ is given
by

$$\hat f(\mathbf{r}) = \sum_i \lambda_{nn,i}(\mathbf{r})\, f_i, \qquad (8)$$

in which the summation is over the natural neighbours $i$ of
the point $\mathbf{r}$ amongst the data points (see Fig. 5, righthand
frame). Note that a slight movement of the interpolation
point will evoke a different set of natural neighbours.

The Sibson natural neighbour interpolation uses area-based
(or volume-based in 3D) interpolation weights $\lambda_{nn,i}(\mathbf{r})$.
These are determined from the volumes of the order-2
Voronoi cells $\mathcal{V}_2(\mathbf{r}, \mathbf{r}_i)$. To understand the concept,
imagine we virtually insert the location $\mathbf{r}$ in the spatial sample
point distribution. Around its location a new Voronoi cell
$\mathcal{V}(\mathbf{r})$ is delimited (see Fig. 5, where the cell is traced by the
blue edges). The virtual cell $\mathcal{V}(\mathbf{r})$ overlaps with the original
Voronoi cells $\mathcal{V}_j$ of its natural neighbours $j$. The order-2
Voronoi cells $\mathcal{V}_2(\mathbf{r}, \mathbf{r}_j)$ are the regions of overlap, and
define the region of space for which $\mathbf{r}$ and $\mathbf{r}_j$ are the closest
"nuclei".

According to Sibson interpolation the interpolation kernel
$\lambda_{nn,i}(\mathbf{r})$ is equal to the normalized area of the order-2
Voronoi cell,

$$\lambda_{nn,i}(\mathbf{r}) = \frac{A_2(\mathbf{r}, \mathbf{r}_i)}{A(\mathbf{r})}, \qquad (9)$$

in which $A(\mathbf{r}) = \sum_j A_2(\mathbf{r}, \mathbf{r}_j)$ is the area of the virtual
Voronoi cell of point $\mathbf{r}$ and $A_2(\mathbf{r}, \mathbf{r}_i)$ the area of the order-2
Voronoi cell $\mathcal{V}_2(\mathbf{r}, \mathbf{r}_i)$.

Evidently, the closer one moves a point $\mathbf{r}$ to a sample
point $\mathbf{r}_i$, the more the Voronoi cell $\mathcal{V}(\mathbf{r})$ will overlap with the
original Voronoi cell $\mathcal{V}_i$, and the larger the volume $A_2(\mathbf{r}, \mathbf{r}_i)$
of the order-2 Voronoi cell $\mathcal{V}_2(\mathbf{r}, \mathbf{r}_i)$ becomes, thus increasing
the weight $\lambda_{nn,i}(\mathbf{r})$. When $\mathbf{r}$ finally coincides with one of the
natural neighbours, the order-2 Voronoi cell will be identical
to the old Voronoi cell at this point. The interpolated field
value $\hat f(\mathbf{r})$ will then be equal to the field value at that point.

Notice that the interpolation weights $\lambda_{nn,i}$ are always
positive and sum to one,

$$\sum_i \lambda_{nn,i}(\mathbf{r}) = 1. \qquad (10)$$



This property is called partition of unity. The resulting func- 
tion /(r) is continuous everywhere within the convex hull 
of the data, and has a continuous slope everywhere except 
at the data themselves. At the position of the vertices the 
derivative of the interpolant is discontinuous. 

In one dimension DTFE and NNFE are exactly the
same. When the data points are given on a regular grid, the
NNFE reduces to the more familiar bi-linear (2d) or trilinear
(3d) interpolation schemes. Our NNFE implementation
is that of Eldering et al. (|200a), a three-dimensional
adaptation of the two-dimensional version available in the CGAL
library.
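The equivalence of the two schemes in one dimension can be checked directly from the definition of the Sibson weights. The following Python sketch is a one-dimensional illustration only: the Voronoi cells are intervals, and the order-2 cells are the pieces that the virtual cell of the inserted point takes from its two neighbours. The function names are hypothetical, and the sample positions are assumed to be distinct.

```python
def sibson_weights_1d(x, samples):
    """Sibson weights of position x with respect to 1D sample positions."""
    xs = sorted(samples)
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x lies outside the convex hull of the samples")
    # The natural neighbours of x in 1D are the two bracketing samples.
    for a, b in zip(xs, xs[1:]):
        if a <= x <= b:
            break
    # Virtual Voronoi cell of x: bounded by the midpoints to a and to b.
    left, right = (a + x) / 2, (x + b) / 2
    # Original Voronoi boundary between the cells of a and b.
    mid = (a + b) / 2
    overlap_a = mid - left    # order-2 cell shared with a
    overlap_b = right - mid   # order-2 cell shared with b
    total = right - left      # size of the virtual cell
    return {a: overlap_a / total, b: overlap_b / total}


def natural_neighbour_interpolate_1d(x, samples, values):
    """Weighted sum of the sample values with the Sibson weights."""
    lookup = dict(zip(samples, values))
    weights = sibson_weights_1d(x, samples)
    return sum(w * lookup[pos] for pos, w in weights.items())


if __name__ == "__main__":
    positions = [0.0, 1.0, 2.0]
    field = [10.0, 20.0, 40.0]
    print(sibson_weights_1d(0.25, positions))
    print(natural_neighbour_interpolate_1d(1.5, positions, field))
```

The weights reduce to $(b - x)/(b - a)$ and $(x - a)/(b - a)$, which are exactly the weights of linear interpolation between the two bracketing samples.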

The NNFE density field reconstruction for the same
Millennium mock sample as described in section 4.1 is
shown in the bottom lefthand panel of Fig. 9. Some of the
peaks in the regions with sparse sampling appear somewhat
anisotropic. This is a consequence of the discontinuous
derivative at the sample point. The overall resulting NNFE
density field is well-behaved and smooth, without the artifacts
that beset the DTFE reconstruction.

A drawback of the DTFE and the NNFE methods is
that neither takes into account the existing spatial
correlations that characterize the cosmological density field. These
are explicitly taken into account by the Kriging interpola- 
tion technique. 



4.3 Interpolation Method III: 
Kriging Interpolation 

By basing itself on the covariance function of the density 
field, Kriging naturally includes the global spatial correla- 
tions of the field. 

The method was named by Matheron (1963) after D.
G. Krige, who started the development of the method (see
Cressie 1990, for a historical overview). The interpolator
has the property that it is a best linear unbiased estimator
(Cressie 1988, 1993). Most applications of Kriging stem
from the field of geostatistics, where Kriging found its origin.
The applications concern measurements at irregularly
scattered points which have to be translated into, for example,
gold, ore or oil field reconstructions or into altitude maps.

There is a distant relationship of Kriging interpolation
to Wiener filtering techniques (Wiener 1949;
Rybicki & Press 1992; Zaroubi et al. 1995; Zaroubi 2002;
Erdogdu et al. 2004, 2006; Kitaura & Enßlin 2008).
However, Wiener filtering is based on a different philosophy than
Kriging, in that it includes a model for the noise and is eval- 
uated in Fourier space. The retrieved field therefore corre- 
sponds to an optimally filtered field over a range of unknown 
scales. The filter scale is dictated by the locally estimated 
noise, with more noise corresponding to a larger amount of 
smoothing. An additional disadvantage for recovering non- 
linear weblike features of classical Wiener filtering is that it 
is predicated on an underlying Gaussian distribution in the 
construction of the least squares estimator for filtering the 
data. While advantageous for the purpose of ascertaining the 
exclusive presence of significant features, it therefore has the 
serious disadvantage of suppressing or substantially diluting 
nonlinear structures of interest and may have difficulties in 
reconstructing the intricacies of the nonlinear structures. 

More advanced recent developments and applications of
Wiener filters have revived their potential for reconstructing
the density distribution. For an extensive and in-depth
overview of these developments we refer to
Kitaura & Enßlin (2008) (also see
Kitaura et al. 2009, 2010).



4.3.1 The Kriging formalism

Interpolation can be viewed as estimating the field value $f$
at location $\mathbf{r}$ by means of a weighted linear combination of
nearby known data points $f(\mathbf{r}_i)$,

$$\hat f(\mathbf{r}) = \sum_{i=1}^{N} \lambda_i f(\mathbf{r}_i). \qquad (11)$$

The main idea of Kriging, as originally formulated, is to
calculate the values of the weights $\lambda_i$ that minimise the error with
respect to the data according to the mean square variation,

$$E(|\hat f(\mathbf{r}) - f(\mathbf{r})|^2),$$

where $E$ is the expectation over the specified quantity. We
show in Jones et al. (2011) that this criterion can be
replaced with the weaker requirement that the data and the
errors be orthogonal in a statistical sense. It follows that the
field $f(\mathbf{r})$ need not be Gaussian distributed in order to
achieve optimal reconstruction of the density field via Kriging.
The Kriging equations for the weights $\lambda_j$ are

$$\sum_{j=1}^{N} C(\mathbf{r}_i, \mathbf{r}_j)\, \lambda_j = c(\mathbf{r}_i, \mathbf{r}), \qquad (12)$$

where the matrix elements of $C$ are given by

$$C(\mathbf{r}_i, \mathbf{r}_j) = E(f(\mathbf{r}_i) f(\mathbf{r}_j)), \qquad (13)$$

and the vector elements $c(\mathbf{r}_i, \mathbf{r})$ by

$$c(\mathbf{r}_i, \mathbf{r}) = E(f(\mathbf{r}_i) f(\mathbf{r})). \qquad (14)$$

From this linear system of $N$ equations, it is straightforward
to determine the $N$ unknown weights $\lambda_i$.

While the matrix has to be inverted only once, the
weights $\lambda_i$ have to be specifically computed for each
interpolation site $\mathbf{r}$. After the weights have been determined, one
can directly obtain the interpolated field values $\hat f$ from
equation (11).

Figure 9. Density Reconstructions of SDSS mock catalogue. Top left: the Millennium galaxy mock sample, following the distant observer
approximation. Top right: DTFE density field reconstruction. Bottom left: NNFE density field reconstruction. Bottom right: lognormal
Kriging reconstruction.
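The linear system for the weights is small enough to write out in a few lines. The following NumPy sketch solves $C\lambda = c$ for one interpolation site; the exponential covariance function with unit variance and unit correlation length, as well as the function names, are assumptions made for illustration and are not the covariance model fitted in this study.

```python
import numpy as np


def exponential_covariance(d, variance=1.0, length=1.0):
    """Hypothetical isotropic covariance model c(d) = variance * exp(-d / length)."""
    return variance * np.exp(-np.asarray(d, dtype=float) / length)


def kriging_weights(sample_positions, target, cov=exponential_covariance):
    """Solve sum_j C(r_i, r_j) lambda_j = c(r_i, r) for the weights lambda."""
    pos = np.asarray(sample_positions, dtype=float)
    target = np.asarray(target, dtype=float)
    pair_dist = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=-1)
    big_c = cov(pair_dist)                                # matrix C(r_i, r_j)
    small_c = cov(np.linalg.norm(pos - target, axis=-1))  # vector c(r_i, r)
    return np.linalg.solve(big_c, small_c)


def kriging_interpolate(sample_positions, sample_values, target,
                        cov=exponential_covariance):
    """Interpolated value: weighted linear combination of the data values."""
    lam = kriging_weights(sample_positions, target, cov)
    return float(lam @ np.asarray(sample_values, dtype=float))


if __name__ == "__main__":
    # Hypothetical sample positions and field values.
    positions = [[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]]
    values = [1.0, 2.0, 3.0, 4.0, 5.0]
    print(kriging_weights(positions, [0.3, 0.3, 0.3]))
    print(kriging_interpolate(positions, values, [0.3, 0.3, 0.3]))
```

When the interpolation site coincides with a data point, the right-hand side equals a column of $C$, so the solution is a unit vector and the data value is reproduced exactly.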



4.3.2 The Kriging Variogram 

Usually, the covariance function $E(f(\mathbf{r}_i) f(\mathbf{r}_j))$ depends only
on distance, $d = |\mathbf{r}_i - \mathbf{r}_j|$. In geostatistics, this spatial
dependence of a random field is usually characterised by means
of a variogram $\gamma(\mathbf{r}_1, \mathbf{r}_2)$. The variogram is the mean square
variation of the field values as function of distance,

$$2\gamma(\mathbf{r}_1, \mathbf{r}_2) = E(|f(\mathbf{r}_1) - f(\mathbf{r}_2)|^2), \qquad (15)$$

which for a stationary random field reduces to

$$2\gamma(\mathbf{h}) = E(|f(\mathbf{r}) - f(\mathbf{r} + \mathbf{h})|^2). \qquad (16)$$

The variogram is related to the covariance function $c(h)$,

$$c(h) = E(f(\mathbf{r}_1) f(\mathbf{r}_2)) = c(0) - \gamma(h). \qquad (17)$$
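For a stationary field, the definition of the variogram suggests a direct empirical estimate: average half the squared field difference over many pairs of positions separated by the lag $h$. The Python sketch below does this for a one-dimensional field given as a function; the function name, the random placement of the pair origins and the linear test field are illustrative assumptions.

```python
import random


def empirical_variogram(field, lag, n_pairs, lo, hi, rng):
    """Estimate gamma(h) as the mean of |f(r) - f(r + h)|^2 / 2.

    `field` is a callable giving the field value at a 1D position, and the
    pair origins r are drawn uniformly so that r + lag stays inside [lo, hi].
    """
    total = 0.0
    for _ in range(n_pairs):
        r = rng.uniform(lo, hi - lag)
        total += (field(r) - field(r + lag)) ** 2
    return total / (2 * n_pairs)


if __name__ == "__main__":
    rng = random.Random(7)
    # Hypothetical smooth test field with a linear trend.
    for h in (0.05, 0.1, 0.2):
        print(h, empirical_variogram(lambda x: 3.0 * x, h, 1000, 0.0, 1.0, rng))
```

For a field with a constant slope $s$ the estimate equals $s^2 h^2 / 2$ for every pair, which makes this a convenient sanity check before applying the estimator to a noisy field.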



For practical purposes it is preferable to use a functional
form for the variogram. There is a variety of such variogram
models (see Cressie 1993, for a detailed description). We use
the exponential expression

(18)

which represents a good fit for the variogram measured from
a Millennium SDSS mock galaxy survey (see sect. 2.2). To
estimate the variogram for this galaxy distribution, we used
the estimator (Cressie 1993)

$$\hat\gamma(\mathbf{h}) = \frac{1}{2 N_p} \sum_{i=1}^{N_p} |f(\mathbf{r}_i) - f(\mathbf{r}_i + \mathbf{h})|^2. \qquad (19)$$

We base our estimate on a large number $N_p$ of randomly
chosen locations within the sample volume.

Figure 10. The measured variograms, from top to bottom,
correspond to the unsmoothed field (red) and to the fields Gaussian
filtered at scales of 1 $h^{-1}$Mpc, 3 $h^{-1}$Mpc, 6 $h^{-1}$Mpc and
10 $h^{-1}$Mpc (black). The fitted variogram models according to
equation (18) are shown in orange.

4.3.3 Lognormal density fields

A lognormally distributed density field $f$ has distribution

$$P_{ln}(f) = \frac{1}{\sqrt{2\pi S^2}} \exp\left\{-\frac{[\log(1 + f) + S^2/2]^2}{2 S^2}\right\} \frac{1}{1 + f}, \qquad (20)$$

where $S^2 = \log(1 + \sigma^2)$ and $\sigma^2 = \langle f^2 \rangle$ is the variance
of the density field (see Coles & Jones 1991).

What might simply be referred to as the "Lognormal
Kriging" procedure uses the logarithm of the density field
value to transform the density field data,

$$\Phi(f) = \log(1 + f). \qquad (21)$$

Since the density $\rho = \bar\rho (1 + f)$ is always positive, the
application of the lognormal approach is valid everywhere and
guarantees a positive definite reconstruction of the density.
The final interpolation values are obtained by taking
the inverse transformation, i.e. the exponential of the
interpolated data values,

$$\hat f(\mathbf{r}) = \exp\left\{\sum_{i=1}^{N} \lambda_i \log(f(\mathbf{r}_i))\right\}. \qquad (22)$$

We will use the logarithmic value of the DTFE-interpolated
field in what follows.

For such a field, figure 10 shows the variogram for
a DTFE interpolated field (red) and for fields Gaussian
smoothed on scales $R_f$ = 1, 3, 6, 10 $h^{-1}$Mpc. The
fitted variogram model parameters (equation 18) are listed in
table 1.

In the second paper of this series on SDSS density field
reconstructions, we will demonstrate that the galaxy
distribution is indeed very well modelled by a lognormal
distribution at scales in excess of 3 $h^{-1}$Mpc.

Table 1. Kriging Variogram Parameters. Parameters obtained
from simulated Millennium SDSS mock catalogue (see text).

4.3.4 Localized Kriging

The value of $N$ to be used in equation (11) has so far not
been defined. We chose the local neighbourhood to be the
tetrahedral natural neighbourhood. In 3 dimensions this is
the union of all natural neighbours of the four vertices of
the Delaunay tetrahedron in which a point $\mathbf{r}$ is located. This
choice exploits the self-adaptivity to density and local shape
of the Delaunay triangulation, and does not suffer the
adverse effects mentioned above for the options of distance or
number of neighbour selection (also cf. the discussions in
Schaap 2007; van de Weygaert & Schaap 2008). Our
experiments indeed confirm that the tetrahedral natural
neighbourhood choice is superior to that of the two options listed
above.

We found that in 3 dimensions the tetrahedral natural
neighbourhood on average contains approximately 57
particles. One may extend the neighbourhood by adding a third
or even more layers around it. An additional third layer
would involve an average of 284 neighbours; however, the
overall quality of the field reconstruction is not significantly
better than with two layers, despite a substantial increase in
computational effort. See Jones et al. (2011) for more details
on this.

Thus in our key equation (22) we use for $N$ the number
of first and second layer vertices surrounding
each point.
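The selection of the tetrahedral natural neighbourhood can be sketched with SciPy's Delaunay triangulation. This is an illustration under the assumption that `scipy.spatial.Delaunay` stands in for the tessellation code actually used; the function name and the uniform random test sample are hypothetical.

```python
import numpy as np
from scipy.spatial import Delaunay


def tetrahedral_natural_neighbourhood(tri, r):
    """Indices of the first and second layer vertices around position r.

    The first layer consists of the vertices of the Delaunay simplex that
    contains r, the second layer of all their natural neighbours.
    """
    simplex = int(tri.find_simplex(np.asarray(r, dtype=float)))
    if simplex < 0:
        raise ValueError("r lies outside the convex hull of the sample")
    indptr, indices = tri.vertex_neighbor_vertices
    neighbourhood = set()
    for v in tri.simplices[simplex]:
        neighbourhood.add(int(v))
        neighbourhood.update(int(j) for j in indices[indptr[v]:indptr[v + 1]])
    return sorted(neighbourhood)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sample = rng.random((2000, 3))  # hypothetical uniform point sample
    tri = Delaunay(sample)
    hood = tetrahedral_natural_neighbourhood(tri, [0.5, 0.5, 0.5])
    print(len(hood))
```

The returned index set is the list of data points that would enter the Kriging system for that interpolation site, so the size of the linear system adapts automatically to the local sampling.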



5 QUALITATIVE DENSITY COMPARISON 

For an assessment of the performance of the three
reconstruction techniques, we turn to the mock catalogues
modelling the SDSS DR6 galaxy survey sample. On the basis of
the knowledge of the underlying density field of the mock
samples, we will be able to make absolute statements about
the quality of the reconstructions.

In this section we first address the visual appearance of
the density field reconstructions, which will allow a
qualitative and global judgement on their ability to reproduce the
true density field. A quantitative error and correlation
analysis will follow in the subsequent section 6, while the
topological properties of the reconstructions will be discussed in
sect. 7.



5.1 Maps of the density field 

Density contour maps for the three reconstructions are
shown in Fig. 11 and Fig. 12. The maps concern thin slices
through the density field reconstructions (and thus have zero
intrinsic width).

The first set of three maps concern the reconstruction 
on the basis of the full Millennium mock galaxy sample 
within the boundaries of the SDSS DR6 North Galactic Cap 
region. The second set concerns the same region, but for the 
volume-limited mock sample. 

To distinguish underdense and overdense regions, the 
underdense regions are indicated by means of a colour con- 
tour map while the overdense regions are represented by 
black contours. The contour levels in all maps are the same. 

The DTFE, NNFE and Kriging maps all display the
same global structure, and their overall appearance is
strikingly similar. In the higher density regions the differences
are minor, and mainly concern the coherence and anisotropy of the
reconstructed filamentary (and sheetlike) features.
Qualitatively speaking, the lower density regions do reveal more
differences between the three methods.

As expected, the volume-limited maps fail in reproduc- 
ing the small-scale structure, while they trace the overall 
weblike outline seen in the full sample maps. 



5.1.1 DTFE map 

The DTFE map is remarkably accurate in outlining the
tenuous weblike filamentary features, in particular in the case
of the full sample reconstruction. Amongst the three
reconstructions, the DTFE one looks crisper than the NNFE
and Kriging maps. It is slightly more capable of tracing the
thin filamentary and sheetlike features, while one might have
a slight worry with respect to the correct reproduction by
NNFE and Kriging of the shape of filaments and walls. Also,
we find that the DTFE maps are clearly marked by higher
density contrasts, both within the overdense regions and
with respect to the underdense regions.

The downside of the detailed structural reconstruction
of DTFE is the more erratic nature of the DTFE contours,
marked by sharp artifacts. These artifacts are particularly
prominent in the field reconstruction of the volume-limited
sample. They are manifestations of the linear interpolation
method: when two neighbouring grid cells are located in
different triangles, the field appears to be discontinuous.
This is mostly an impression, as sampling at finer scales
would show that the field is in fact continuous. These
artifacts occur in situations where the typical separation
between sample points is considerably larger than the size of
the grid cells. This is also the reason why these artifacts
are more prominent in the DTFE reconstruction of the
volume-limited sample than in that of the full sample.
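The continuity argument can be illustrated with a minimal two-dimensional sketch. It is not the DTFE implementation used in this study: the two triangles, the vertex densities and the helper names (`barycentric`, `interpolate`, `gradient`) are made up for illustration. The linear interpolant takes the same value on the shared edge from either side, while its gradient jumps, which is what a coarse grid renders as an apparent discontinuity.

```python
def barycentric(p, tri):
    """Barycentric coordinates of point p with respect to triangle tri."""
    (x1, y1), (x2, y2), (x3, y3) = tri
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    l1 = ((y2 - y3) * (p[0] - x3) + (x3 - x2) * (p[1] - y3)) / det
    l2 = ((y3 - y1) * (p[0] - x3) + (x1 - x3) * (p[1] - y3)) / det
    return l1, l2, 1.0 - l1 - l2


def interpolate(p, tri, values):
    """Linear interpolation of the vertex values inside one triangle."""
    return sum(w * v for w, v in zip(barycentric(p, tri), values))


def gradient(tri, values, h=1e-6):
    """Finite-difference gradient of the interpolant (constant per triangle)."""
    cx = sum(v[0] for v in tri) / 3.0
    cy = sum(v[1] for v in tri) / 3.0
    f0 = interpolate((cx, cy), tri, values)
    gx = (interpolate((cx + h, cy), tri, values) - f0) / h
    gy = (interpolate((cx, cy + h), tri, values) - f0) / h
    return gx, gy


# Hypothetical configuration: two triangles sharing the edge A-B.
A, B, C, D = (0.0, 0.0), (0.0, 1.0), (-1.0, 0.5), (3.0, 0.5)
rho = {A: 1.0, B: 2.0, C: 0.1, D: 8.0}  # made-up vertex densities
left, right = (A, B, C), (A, B, D)
left_vals = [rho[v] for v in left]
right_vals = [rho[v] for v in right]

on_edge = (0.0, 0.3)                      # a point on the shared edge
f_left = interpolate(on_edge, left, left_vals)
f_right = interpolate(on_edge, right, right_vals)
grad_left = gradient(left, left_vals)
grad_right = gradient(right, right_vals)

print(f_left, f_right)        # same value from both sides
print(grad_left, grad_right)  # different slopes across the edge
```

The field value agrees on the edge, so the sharp features in the maps are kinks in the gradient rather than jumps in the field itself.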



5.1.2 NNFE and Kriging

The reconstructed density maps of the two higher order
schemes, NNFE and Kriging, have a considerably smoother
appearance than the DTFE maps. The larger number of
neighbours involved in the NNFE and Kriging reconstructions
translates into the slightly rounder contours of these
structures. It also reveals itself in the absence of
artifacts such as those seen in the DTFE maps.

Part of the difference between the smoother higher order
maps and the DTFE maps is an expression of the number
of points, and field values, involved in the interpolation step
(cf. e.g. equation 22). DTFE uses 4 points, the vertices of a
Delaunay tetrahedron. NNFE involves on average 17 natural
neighbours, while the Natural Kriging scheme invokes 57
neighbours.

The smoother nature of the NNFE and Kriging maps is 
therefore an expression of the somewhat lower information 
content of these filtered maps. As a result, they also have 
less noisy low density regions than those seen in the DTFE 
maps. 



5.1.3 Anisotropic Structure and Features 

One of the crucial benefits of DTFE is that it is able to
identify anisotropic features, like walls and filaments, and
successfully reproduce their shape and morphology (see
Schaap 2007; van de Weygaert & Schaap 2009).

From the density maps we see that NNFE and Kriging
find the same filamentary structures. Overall, the impression
is that DTFE and NNFE produce maps in which the cosmic
web is more coherent than in the Kriging maps: the mass
concentrations in the Kriging maps have a slight tendency to
break up more easily into clumps. This is true for both the
full sample maps and the volume-limited sample maps. One
exception, in the volume-limited map, seems to be the
filamentary extension running from
$(X,Y) = (170,160)\,h^{-1}\,{\rm Mpc}$ to
$(X,Y) = (300,225)\,h^{-1}\,{\rm Mpc}$. One reason for the
somewhat more fragmentary nature of the Kriging maps is the
use of a radially symmetric covariance function, while DTFE
and NNFE are based on kernels that adapt to the local shape.



5.1.4 Underdensities & Voids 

When turning to the underdense regions, we find that in the
full density map both DTFE and Kriging delineate them at
high contrast levels. The voids in the NNFE map have a
lower contrast. This is partially the result of the larger
NNFE neighbourhood radius in low density areas. In this
respect, the Kriging correlations assure a better performance.
DTFE remains sensitive owing to its highly local character.

A comparison between the void population in the full
sample reconstruction and that in the volume-limited sample
reconstruction reveals a considerable contrast between
the results for the different reconstruction methods. None of
the volume-limited reconstructions contain the small voids
visible in the full sample maps. This is a reflection of the
absence of such depressions in the diluted point sample. Of the
three reconstructions, the contrast between the two DTFE
maps is less distinct than that between the two NNFE and
two Kriging maps. DTFE at least manages to trace the
large voids at $(X,Y) = (180,140)\,h^{-1}\,{\rm Mpc}$, at
$(X,Y) = (350,200)\,h^{-1}\,{\rm Mpc}$ and at
$(X,Y) = (460,120)\,h^{-1}\,{\rm Mpc}$.
Kriging and NNFE hardly manage to find the latter in
the volume-limited maps, while the huge void complex near
$(X,Y) = (350,200)\,h^{-1}\,{\rm Mpc}$ is a largely uniform
moderate underdensity in the Kriging map. Only the DTFE map
reveals its true nature, a region marked by several deep voids
embedded in a larger moderate underdensity.

Figure 11. Comparison of SDSS-DR6 reconstructions for the magnitude-limited galaxy sample. Shown are zero-width sections through
the density field reconstructions. Top: DTFE; Centre: NNFE; Bottom: Natural Lognormal Kriging. The coloured contour levels represent
the underdense regions, at $\rho/\rho_u$ = [0.001, 0.002, 0.005, 0.01, 0.02, 0.03, 0.05, 0.07, 0.1, 0.2, 0.3, 0.6, 0.7, 0.8, 1]. The white areas are the
overdense regions and the black contour lines represent a density contrast $\rho/\rho_u$ = [1., 3., 10.].

Figure 12. Comparison of SDSS-DR6 reconstructions for the volume-limited galaxy sample. Shown are zero-width sections through the
density field reconstructions. Top: DTFE; Centre: NNFE; Bottom: Natural Lognormal Kriging. The coloured contour levels represent the
underdense regions, at $\rho/\rho_u$ = [0.001, 0.002, 0.005, 0.01, 0.02, 0.03, 0.05, 0.07, 0.1, 0.2, 0.3, 0.6, 0.7, 0.8, 1]. The white areas are the overdense
regions and the black contour lines represent a density contrast $\rho/\rho_u$ = [1., 3., 10.].

Figure 13. Linear profiles along line of sight. All profiles are taken along the Y-axis, at $(X,Z) = (300,300)\,h^{-1}\,{\rm Mpc}$. Top: DTFE
density field reconstruction; Centre: NNFE density field reconstruction; Bottom: Natural Lognormal Kriging reconstruction. In each
panel: black line: density field reconstruction full galaxy sample; red lines: density field reconstruction volume-limited galaxy sample;
gray lines: Gaussian filtered density field reconstruction full galaxy sample, $R_f = 10.0\,h^{-1}\,{\rm Mpc}$; orange lines: Gaussian filtered density
field reconstruction volume-limited sample.



5.2 Density Profiles 

To appreciate the small-scale details in the density field
reconstructions, figure 13 displays a linear profile, along a
radial distance of $500\,h^{-1}\,{\rm Mpc}$, through the density
field reconstructions. All profiles are taken along the Y-axis,
at $(X,Z) = (300,300)\,h^{-1}\,{\rm Mpc}$ (cf. also Fig. 11 and
Fig. 12). The panels, from top to bottom, concern the DTFE,
NNFE and Kriging reconstructions. The black lines are linear
profiles through the full sample reconstructions, the
superimposed red profiles concern the volume-limited mock
sample reconstructions. We also show the linear profiles
through these density field reconstructions, Gaussian filtered
on a scale of $R_f = 10\,h^{-1}\,{\rm Mpc}$. The gray lines are the
linear profiles through the filtered full sample density field,
the orange lines those through the filtered volume-limited
sample density field.

Figure 14. Correlation diagrams for DTFE density field reconstruction. Plotted are the value of the density reconstruction from the
full galaxy catalogue, $\delta_{\rm full} + 1$ (abscissa), against the density reconstruction from the magnitude-limited mock catalogues, $\delta_{\rm mock} + 1$
(ordinate). The colours indicate the number of voxels that occupy the corresponding position in the correlation diagram, with white
indicating the density pairs with highest occurrence and subsequent contour levels at logarithmic spacings of ~ 2.5 (see text). Top left:
unsmoothed density field reconstructions, $R_f = 0.0\,h^{-1}\,{\rm Mpc}$. Top right: $R_f = 1.0\,h^{-1}\,{\rm Mpc}$. Bottom left: $R_f = 6.0\,h^{-1}\,{\rm Mpc}$. Bottom
right: $R_f = 10.0\,h^{-1}\,{\rm Mpc}$. Note that the apparent offset of the small scale panels is a result of cosmic variance introduced by the locally
estimated density field normalization.

The comparison between the linear profiles through the
NNFE and Kriging reconstructions on the one hand, and
the DTFE reconstruction on the other hand, leads to the
following observations:

• Underdense regions of NNFE and Kriging are less noisy.
A nice example of this is the underdense region at X =
$200\,h^{-1}\,{\rm Mpc}$.

• Maxima tend to be wider in the higher order schemes. 

• At larger radial distance, the topology of the unfiltered 
reconstructions is much smoother. 

Figure 15. Correlation diagrams for DTFE, NNFE and Natural Lognormal Kriging density field reconstructions. Plotted are the value
of the density reconstruction from the full galaxy catalogue, $\delta_{\rm full} + 1$ (abscissa), against the density reconstruction from the magnitude-
limited mock catalogues, $\delta_{\rm mock} + 1$ (ordinate). The density values are those for the field filtered on a scale $R_f = 3.0\,h^{-1}\,{\rm Mpc}$. The colours
indicate the number of voxels that occupy the corresponding position in the correlation diagram, with white indicating the density pairs
with highest occurrence and subsequent contour levels at logarithmic spacings of ~ 2.5 (see text). Left: DTFE reconstruction; Centre:
NNFE reconstruction; Right: Natural Lognormal Kriging.

In the cosmological context, the density is approximately
lognormally distributed (see Coles & Jones 1991; Kofman et al.
1994; Sheth 1995; Kayo et al. 2001; Kitaura et al. 2010), which
impels us to work with the log of the inferred density field.
Our adaptation of the Kriging method to tessellations of
lognormally distributed discrete point data is presented in
detail in a separate paper (Jones et al. 2011), where we also
discuss the relationship of this procedure with the Constrained
Random Field formalism developed by Bertschinger (1987) (also
Hoffman & Ribak 1991; Rybicki & Press 1992; Sheth 1995;
van de Weygaert & Bertschinger 1996). We adopt the
nomenclature used in the geostatistical literature.
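A toy sketch of why working with the log of the density is attractive for a field spanning several orders of magnitude. This is only a two-point interpolation, not the Natural Lognormal Kriging scheme; the function names and the density values are illustrative assumptions.

```python
import math


def linear_interp(rho_a, rho_b, t):
    """Interpolate the density itself between two sample values (0 <= t <= 1)."""
    return (1.0 - t) * rho_a + t * rho_b


def log_interp(rho_a, rho_b, t):
    """Interpolate log(density) and transform back; the result is always positive."""
    return math.exp((1.0 - t) * math.log(rho_a) + t * math.log(rho_b))


# Hypothetical densities (in units of the mean) of a void point and a cluster point.
rho_void, rho_cluster = 0.01, 10.0

midpoint_linear = linear_interp(rho_void, rho_cluster, 0.5)
midpoint_log = log_interp(rho_void, rho_cluster, 0.5)

print(midpoint_linear)  # dominated by the high density point
print(midpoint_log)     # geometric mean of the two values
```

In log space the midpoint is the geometric mean of the two densities, so a single high density sample does not dominate its low density neighbours, and the back-transformed field cannot become negative.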



6 QUANTITATIVE DENSITY FIELD 
ANALYSIS 

For the quantitative comparison and error evaluation of the
three density field reconstruction methods, we are
particularly interested in the ability to recover the underlying
density field from a magnitude-limited or volume-limited survey.

In this section we compare the DTFE, NNFE and
Natural Lognormal Kriging reconstructions on the basis
of magnitude-limited or volume-limited mock samples with
the corresponding density fields determined from the
full galaxy samples. The latter are galaxy redshift space
samples that we would hypothetically obtain if we were able to
observe all galaxies in the Millennium mock sample. The
magnitude- and volume-limited samples are obtained from
this Millennium mock sample by imposing the observational
specifications of the SDSS survey (see sect. 2.1). An
impression of the differences in structural resolution is
provided by the visual comparison of the NNFE full,
magnitude-limited and volume-limited sample reconstructions at
the beginning of this study, in Fig. 4.

We will restrict the error analysis to the redshift space 
density maps, and not assess the errors introduced by pecu- 
liar velocity distortions. These are investigated in detail in 
the follow-up paper. 

Most of our analysis focuses on the quality of the den- 
sity field reconstructions obtained from magnitude-limited 
samples, unless specifically stated otherwise. 



6.1 Magnitude-limited survey reconstructions: 
Correlation Diagrams 

The first comparison between the magnitude-limited sample
density field reconstruction and the full sample density
field concerns a purely local point-to-point comparison. This
test involves an inspection of the correlation diagrams of the
density field value $\delta_{\rm full}(\mathbf{r})$ of the full sample
density field at location $\mathbf{r}$ versus the corresponding
density value $\delta_{\rm mock}(\mathbf{r})$.

Since this is a strictly local comparison, and does not
distinguish between systematic nonlocal offsets and random
field fluctuations, we try to get an impression of
environmental effects by simply studying a sequence of filtered
density fields. Of each density field we study four Gaussian
filtered versions, at filter radii of $R_f$ = 0.0, 1.0, 6.0 and
$10.0\,h^{-1}\,{\rm Mpc}$. These scales represent the transition
from the nonlinear regime ($1\,h^{-1}\,{\rm Mpc}$) to quasi-linear
and linear scales at $10.0\,h^{-1}\,{\rm Mpc}$.

If the survey-based reconstruction were perfect, the
correlation diagram would be a perfect one-to-one line. The
correlation diagrams show the level of scatter, reveal
whether the reconstruction errors depend on density, and
expose the presence of systematic offsets.



6.1.1 DTFE correlation diagrams 

Fig. 14 presents the correlation diagrams for the
DTFE reconstructed fields at four filter scales, $R_f$ =
0.0, 1.0, 6.0 and $10.0\,h^{-1}\,{\rm Mpc}$.

Instead of a pure scatter diagram, we depict the density
of the pairs $(\delta_{\rm full}, \delta_{\rm mock})$ by means of
contour levels. The contour levels in Fig. 14 depict the number
density of pairs in logarithmic bins of size
$\Delta\log(1+\delta_{\rm full}) \times \Delta\log(1+\delta_{\rm mock}) = (0.02 \times 0.02)$.
The levels run from one pixel per bin (black) to
the maximum number density (white), in logarithmic steps
of ~ 2.5. In each frame the black diagonal line indicates the
exact one-to-one relation between the full sample density field
and the density field following from the magnitude-limited
sample.
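The binning behind such a correlation diagram can be sketched as follows. This is an illustrative stand-in rather than the analysis code of this study: it assumes base-10 logarithms, reads the "logarithmic steps of ~ 2.5" as a multiplicative factor of 2.5 between successive contour levels, and uses synthetic lognormal values in place of the voxel densities.

```python
import math
import random
from collections import Counter

BIN = 0.02  # bin width in log10(1 + delta), as quoted in the text


def log_bin(delta):
    """Index of the logarithmic bin that contains 1 + delta."""
    return math.floor(math.log10(1.0 + delta) / BIN)


def correlation_diagram(delta_full, delta_mock):
    """Count voxels per (full, mock) bin pair."""
    counts = Counter()
    for d_full, d_mock in zip(delta_full, delta_mock):
        counts[(log_bin(d_full), log_bin(d_mock))] += 1
    return counts


def contour_levels(counts, step=2.5):
    """Levels from one voxel per bin up to the peak, spaced by a factor `step`."""
    levels = [1.0]
    peak = max(counts.values())
    while levels[-1] * step <= peak:
        levels.append(levels[-1] * step)
    return levels


# Synthetic stand-in data: lognormal "full" field, "mock" field with extra scatter.
random.seed(42)
full = [10.0 ** random.gauss(0.0, 0.5) - 1.0 for _ in range(5000)]
mock = [(1.0 + d) * 10.0 ** random.gauss(0.0, 0.1) - 1.0 for d in full]

counts = correlation_diagram(full, mock)
levels = contour_levels(counts)
print(len(counts), "occupied bins; contour levels:", levels)
```

A perfect reconstruction would populate only the bins with equal indices on both axes; the width of the occupied band around that diagonal is the scatter discussed in the text.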

For all filter radii, we find that over two orders of
magnitude the correlation diagrams centre around a strict
linear one-to-one relation. For the smaller filter radii
($R_f = 0.0\,h^{-1}\,{\rm Mpc}$ and $R_f = 1.0\,h^{-1}\,{\rm Mpc}$) the
relation appears to be slightly offset. This is a reflection of
the necessarily small volumes probed by the galaxy survey at
distances close to the observer, introducing cosmic variance
effects through the locally estimated density normalization
(see section 6.3).

Evidently, the scatter around the linear 1-1 relation
decreases as the filter radius increases. For both
$R_f = 0.0\,h^{-1}\,{\rm Mpc}$ and $R_f = 1.0\,h^{-1}\,{\rm Mpc}$, we find
a more substantial scatter in the low density regions compared
to the higher density areas. The scatter over the full density
range turns into a more and more uniform level as we go to
larger filter radii. Indeed, for $R_f > 3.0\,h^{-1}\,{\rm Mpc}$ and
$0.1 < \delta < 1$ the agreement is very good.

A related additional trend is the change from a slight upturn
at low density values for $R_f = 1.0\,h^{-1}\,{\rm Mpc}$ towards a
slight downturn at larger filter radii. In other words, at
larger filter radii the reconstructions seem to have a
systematic bias towards more underdense values.

However, we need to be careful in drawing general con- 
clusions with respect to the low and high density extremes. 
In particular, for the large filter radii, these tend to be dom- 
inated by only a few rare objects. Because the offsets are 
relatively minor, we do not consider it a serious problem. 



6.1.2 DTFE, NNFE and Kriging correlation diagrams 

To compare the performance of the three methods, Fig. 15
shows the correlation diagrams for the density field
reconstructions at a filter scale of $R_f = 3.0\,h^{-1}\,{\rm Mpc}$.
Each of the reconstructions shows an almost perfect one-to-one
relation: all three methods yield unbiased reconstructions.

We cannot detect any large differences between the
methods. This suggests that the remaining deviations of the
reconstructed density fields are due to the initial DTFE
density estimate. Again, for $R_f > 3.0\,h^{-1}\,{\rm Mpc}$ and
$0.1 < \delta < 1$ the agreement is very good.

Alternative density estimators might lead to further im- 
provements. 



6.2 Magnitude-limited survey reconstructions: 
Intrinsic Smoothing Scale 

A characteristic of the magnitude-limited survey is the 
change of intrinsic spatial resolution as we proceed out to 
larger distances and the galaxy sample includes only the 
most luminous objects. While one may correct the density 
estimates for the accompanying dilution (see sec.O, the loss 
of small scale resolution and the accompanying geometric 
resolution is impossible to correct. This affects any study of 
filaments and walls on the basis of the reconstructed density 
field maps. 

The correction for this dilution effect is complicated by 
another effect, the intrinsic density-dependent resolution of 
the sampled galaxy surveys. Higher density regions are sam- 
pled by more galaxies, and are therefore more resolved than 
the low density void regions. 

Figure 16. Intrinsic Smoothing Scale and Density. Black
contours: the correlation diagram between the
$R_f = 3.0\,h^{-1}\,{\rm Mpc}$ filtered DTFE density field of the
full galaxy sample (abscissa) and the unsmoothed DTFE density
field reconstruction on the basis of the magnitude-limited
survey (ordinate). Superimposed on the correlation diagram
between the unfiltered full sample density field (abscissa) and
the unsmoothed magnitude-limited mock sample density field
(ordinate). The colours indicate the number of voxels that
occupy the corresponding position in the correlation diagram,
with white indicating the density pairs with highest occurrence
and subsequent contour levels at logarithmic spacings of ~ 2.5.

We may evaluate the intrinsic resolution of the density
maps in terms of the intrinsic smoothing scale $R_{\rm int}$.
Effectively, it is related to the characteristic galaxy
separation at a given redshift and position. It is rather
straightforward to incorporate the effect of the increasing
dilution as a function of redshift z, on the basis of the
radial selection function $\psi(z)$ (see equation 1). However,
the considerable density contrasts in the inhomogeneous matter
distribution render it far from trivial to find a correct
expression for $R_{\rm int}(z)$. The mean galaxy separation is
biased to high density regions and seriously underestimates the
correct value $R_{\rm int}(z)$: the intrinsic smoothing scale is
smaller than average in higher density areas, and larger in low
density regions.
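The density and redshift dependence can be made concrete with a rough proxy. The sketch below assumes, purely for illustration, that the local smoothing scale follows the mean inter-galaxy separation $n^{-1/3}$ of the locally observed number density; the text stresses that the correct expression for the intrinsic smoothing scale is not this simple. The function name, the mean density value and the selection fraction are hypothetical.

```python
def mean_separation(n_mean, selection=1.0, delta=0.0):
    """Mean inter-galaxy separation for a local number density.

    n_mean    : mean galaxy number density of the full sample
    selection : fraction of galaxies retained at this redshift (0 < selection <= 1)
    delta     : local density contrast, so the local density is n_mean * (1 + delta)
    """
    number_density = n_mean * selection * (1.0 + delta)
    return number_density ** (-1.0 / 3.0)


N_MEAN = 0.01  # hypothetical mean density, in galaxies per (h^-1 Mpc)^3

r_mean = mean_separation(N_MEAN)                       # mean-density region, no dilution
r_cluster = mean_separation(N_MEAN, delta=9.0)         # ten times the mean density
r_void = mean_separation(N_MEAN, delta=-0.9)           # a tenth of the mean density
r_diluted = mean_separation(N_MEAN, selection=0.125)   # only 1 in 8 galaxies observed

print(r_cluster, r_mean, r_void, r_diluted)
```

With these made-up numbers the separation in the void is more than four times that in the cluster, and observing one galaxy in eight doubles the separation everywhere.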



6.2.1 Intrinsic Smoothing Scale: Correlation Diagram 

To appreciate the effect and the dependence of the intrinsic
smoothing length on density, we evaluate the correlation
between the filtered (DTFE) density field of the full galaxy
sample and the DTFE density field reconstruction on the
basis of the magnitude-limited sample.

In particular, we are interested in the question to what
extent the filtered density field reconstructions are
representative. We may assume they are as long as the
corresponding filter scale $R_f > R_{\rm int}(z)$. By evaluating
the filtered density fields we may obtain an idea of the
intrinsic smoothing scale on the basis of the density field
error analysis: a sudden rise of error would indicate that
$R_{\rm int}(z) > R_f$.

By means of the black contours in Fig. 16 we show the
correlation diagram of the full sample density field filtered
at $R_f = 3.0\,h^{-1}\,{\rm Mpc}$ versus the unfiltered density
field on the basis of the magnitude-limited galaxy sample. It
is superimposed on the (colour) correlation diagram between the
unfiltered full sample density field and the unfiltered density
field reconstruction from the magnitude-limited mock sample.
While the latter has substantial scatter over the full
density range, the correlation between the filtered field and
the magnitude-limited field appears to be much tighter. It
confirms the impression that reconstructed fields resemble
density fields that are filtered on a particular scale. The
reconstructed density maps in Fig. 11 and Fig. 12 form a
telling illustration of this observation.

The most outstanding feature of the filtered field
correlation diagram is the curved nature of the contours, quite
different from a regular linear relation. In the higher density
regions we find that the densities in the magnitude-limited
sample reconstruction are systematically biased to higher
values. In the low to moderate density areas the trend is
reversed: the densities of the raw survey density field tend to
be somewhat lower than in the filtered field. This is a direct
illustration of the density dependent nature of the intrinsic
smoothing length $R_{\rm int}(z)$. Because the effective smoothing
length is small in high density regions, the low density
regions get relatively over-smoothed and so the density is
relatively over-estimated. More formally, it is an expression of
the greater information content in the higher density regions
of galaxy redshift surveys.



6.2.2 Filtered density field reconstructions 

The filtering of a raw reconstructed density field will smooth 
out the high density values. The corresponding mass is re- 
distributed to lower density regions. We argue that this can 
only yield correct density distributions if the nonlinear fea- 
tures in the mass distribution were reproduced at the correct 
position and with the correct amplitude in the raw sample 
density field reconstruction. Information from high density 
features and nonlinear objects is crucial for obtaining the 
correct large scale density field. 
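The redistribution of mass by filtering can be illustrated with a one-dimensional toy field. This is a minimal sketch with a periodic, normalised Gaussian kernel written in plain Python; the field values, grid size and function names are made up, and the actual analysis filters three-dimensional grids.

```python
import math


def gaussian_kernel(r_f, dx):
    """Normalised Gaussian weights, truncated at 4 filter radii."""
    half = int(math.ceil(4.0 * r_f / dx))
    weights = [math.exp(-0.5 * (i * dx / r_f) ** 2) for i in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def gaussian_filter_periodic(field, r_f, dx=1.0):
    """Convolve a 1D periodic field with a Gaussian of radius r_f."""
    if r_f <= 0.0:
        return list(field)
    kernel = gaussian_kernel(r_f, dx)
    half = len(kernel) // 2
    n = len(field)
    return [
        sum(k * field[(i + j - half) % n] for j, k in enumerate(kernel))
        for i in range(n)
    ]


# Toy field: a uniform underdense background with one sharp high density peak.
n_cells = 200
raw = [0.2] * n_cells
raw[100] = 50.0

smoothed = gaussian_filter_periodic(raw, r_f=3.0)

print(sum(raw), sum(smoothed))   # total mass is unchanged
print(max(raw), max(smoothed))   # the peak is lowered
print(raw[97], smoothed[97])     # neighbouring low density cells gain mass
```

Because the kernel is normalised, the total mass is conserved: whatever the peak loses ends up in the surrounding low density cells. If the peak were misplaced or had the wrong amplitude in the raw reconstruction, the filtered field around it would inherit that error.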



6.3 Radial Error Analysis 

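To quantify the reconstruction errors, we compare the local
density values $f(\mathbf{r})_{\rm mock}$ of the magnitude-limited
survey reconstruction with the density $f(\mathbf{r})$ of the
full sample. We evaluate the absolute error $\epsilon_1(\mathbf{r})$,

$$\epsilon_1(\mathbf{r}) = \left| f(\mathbf{r}) - f(\mathbf{r})_{\rm mock} \right| , \qquad (23)$$

as well as the relative error $\epsilon_2(\mathbf{r})$,

$$\epsilon_2(\mathbf{r}) = \frac{\left| f(\mathbf{r}) - f(\mathbf{r})_{\rm mock} \right|}{f(\mathbf{r})} . \qquad (24)$$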
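As a minimal sketch of how the two error measures and their averages over radial shells could be evaluated, consider the following. The function names, the shell width and the four sample voxels are hypothetical; the study itself works on full three-dimensional voxel grids.

```python
import math


def absolute_error(f_full, f_mock):
    """Absolute error, equation (23)."""
    return abs(f_full - f_mock)


def relative_error(f_full, f_mock):
    """Relative error, equation (24); assumes a non-zero full sample density."""
    return abs(f_full - f_mock) / f_full


def radial_profile(positions, f_full, f_mock, shell_width, error=relative_error):
    """Average an error measure over spherical shells centred on the observer."""
    sums, counts = {}, {}
    for (x, y, z), ff, fm in zip(positions, f_full, f_mock):
        shell = int(math.sqrt(x * x + y * y + z * z) // shell_width)
        sums[shell] = sums.get(shell, 0.0) + error(ff, fm)
        counts[shell] = counts.get(shell, 0) + 1
    return {s: sums[s] / counts[s] for s in sorted(sums)}


# Four made-up voxels: positions relative to the observer and two density estimates.
positions = [(10.0, 0.0, 0.0), (0.0, 20.0, 0.0), (0.0, 0.0, 120.0), (90.0, 90.0, 0.0)]
f_full = [1.0, 2.0, 4.0, 0.5]
f_mock = [1.5, 1.0, 4.0, 1.0]

rel_profile = radial_profile(positions, f_full, f_mock, shell_width=50.0)
abs_profile = radial_profile(positions, f_full, f_mock, shell_width=50.0,
                             error=absolute_error)
print(rel_profile)
print(abs_profile)
```

The relative error weights low density voxels more heavily than the absolute error does, which is why the two profiles can show different trends for the same pair of fields.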



6.3.1 Error Profiles 



In figure 17 we plot the absolute errors $\epsilon_1(r)$ along the
same radial direction as in Fig. 13. The corresponding DTFE
density profile is plotted in the top panel, while the
subsequent three panels depict the $\epsilon_1$ profiles for the
DTFE, NNFE and Natural Lognormal Kriging density fields.

The green lines are the error profiles of the unfiltered
density field (black line in the upper panel), the purple lines
are the error profiles of the $10\,h^{-1}\,{\rm Mpc}$ smoothed
field (gray line in the upper panel). We find that the errors of
all three reconstruction techniques are fairly similar, with a
slightly increasing trend with distance for the unfiltered
density field. At a distance of approximately
$100\,h^{-1}\,{\rm Mpc}$ the error is of the order of unity. Beyond
this distance the errors are characterised by wide peaks that
are mainly due to the magnitude-limited survey's undersampling
of the density field at large distances.

The errors in the $10\,h^{-1}\,{\rm Mpc}$ filtered field remain
small over the whole reach of the linear profile, averaging an
error level of around 10 percent. At intermediate distances,
$200\,h^{-1}\,{\rm Mpc} < R < 400\,h^{-1}\,{\rm Mpc}$, the DTFE errors
are somewhat lower than those of the other methods. For all
three methods, the largest errors are found at close distances.
This is attributed to the survey mask, which is relatively thin
in our immediate cosmic environment, i.e. at distances
$R < 100\,h^{-1}\,{\rm Mpc}$. Note that the filtered field errors at
large distances appear to converge to the error profile of the
unfiltered field reconstruction. It is a reflection of the fact
that the intrinsic smoothing length becomes comparable to
the filter radius.

Figure 17. Linear error profiles. The absolute error $\epsilon_1$
(equation 23) as a function of distance, along the same axis as
the density profiles in Fig. 13. Upper panel: the density profile
through the DTFE density field (see Fig. 13 for description). The
subsequent frames are the error profiles for the three
reconstruction methods. The green line represents the error of
the unsmoothed density field, the purple line that of the
$10.0\,h^{-1}\,{\rm Mpc}$ filtered field. 2nd frame from top: DTFE
density field; 3rd frame from top: NNFE density field; bottom:
Natural Lognormal Kriging density field.



6.3.2 Radially averaged error profiles

By averaging the errors of the magnitude-limited survey
density field reconstructions over radial shells we obtain an
idea of error trends as a function of distance. To minimize edge
effects we exclude locations within 15 voxels from the edge
of the survey volume.

The radially averaged error profiles for the DTFE density
field reconstruction are shown in the top panel of Fig. 18.
It concerns the absolute error $\epsilon_1$ (solid line) and
relative error $\epsilon_2$ (dashed line) for four different
filtered DTFE density fields. These are Gaussian filtered fields
at filter scales of $R_f$ = 1.0, 3.0, 6.0 and
$10.0\,h^{-1}\,{\rm Mpc}$.

At nearby distances $R < 100\,h^{-1}\,{\rm Mpc}$ the average error
trend is that of decreasing absolute and relative errors. This
is a reflection of the small and unrepresentative survey
volumes involved. The fact that this effect appears to be most
prominent for the largest filter scale,
$R_f = 10.0\,h^{-1}\,{\rm Mpc}$, is indicative. At this scale the
reconstruction involves only a few independent wavevectors that
constitute the resulting density field. Errors are enhanced by
the way in which the weight factors $w(z)$ (equation 3) are
determined. We do this by fitting the selection function on the
basis of the data (see sect. 2.4). At close distances this fit
is heavily influenced by the presence or absence of large
superstructures, which easily evokes systematic offsets in the
local density estimate. It would probably be better to deal with
such systematics on the basis of volume-limited samples at close
distances $R < 150\,h^{-1}\,{\rm Mpc}$.

Figure 18. Radially averaged error profiles. Top: the absolute
error profile $\epsilon_1(r)$ (solid line) and relative error
$\epsilon_2(r)$ (dashed line) for the magnitude-limited survey
DTFE density field reconstruction. Going from dark to light
blue, the profiles represent the error profiles through a
sequence of filtered density fields:
$R_f = 1.0\,h^{-1}\,{\rm Mpc}$ (black), $R_f = 3.0\,h^{-1}\,{\rm Mpc}$,
$R_f = 6.0\,h^{-1}\,{\rm Mpc}$ and $R_f = 10.0\,h^{-1}\,{\rm Mpc}$
(light blue). Bottom: the radially averaged relative error
profile $\epsilon_2(r)$ for the DTFE density field reconstruction
(solid lines), the NNFE density field reconstruction
(dashed lines) and the Natural Lognormal Kriging reconstruction
(dot-dashed lines). The colours correspond to the same Gaussian
filtered fields as specified for the top panel.



6.3.3 DTFE, NNFE and Kriging error profiles

The bottom panel of Fig. 18 compares the relative error
trend for the three different reconstruction formalisms. The
radial error profiles are shown for the four filter scales
$R_f$ = 1.0, 3.0, 6.0 and $10.0\,h^{-1}\,{\rm Mpc}$. The DTFE
error profiles are marked by the solid lines, the NNFE ones
by the dashed lines and the Lognormal Kriging ones by the
dot-dashed lines.

For all three methods the errors at the smallest filter
scale, $R_f = 1.0\,h^{-1}\,{\rm Mpc}$, are substantial, hardly ever
better than 50%. The errors decrease rapidly with filter scale,
such that at scales $R_f > 6\,h^{-1}\,{\rm Mpc}$ the density field
can be reconstructed with reasonable accuracy. Beyond a distance
of $100\,h^{-1}\,{\rm Mpc}$ the relative errors throughout the
whole survey volume do not exceed the 10% level.

Overall, we find strikingly small differences between the
three methods. None performs distinctly better than any of
the others. On a more detailed level, we may observe two
differences. Firstly, we see that the Kriging errors at small
scales are relatively high. Secondly, DTFE appears to perform
somewhat better at nearly all scales, except the smallest
scale $R_f = 1.0\,h^{-1}\,{\rm Mpc}$. In terms of density errors,
the relatively simple and direct DTFE method seems to work
best.



6.4 Volume limited survey density field 
reconstructions 



A volume-limited survey circumvents the resolution
complications encountered in magnitude-limited surveys. Instead
of having to correct for the steadily decreasing resolution
at larger distances, volume-limited surveys have the advantage
of a uniform sample resolution. This renders the geometric
and topological analysis of structures considerably
more straightforward. Moreover, in addition to removing
the problem of varying spatial resolution, the major advantage
of using volume-limited samples is that one is using the
same type and luminosity of galaxies at different distances.
The major disadvantage of volume-limited galaxy samples
is their relatively low resolution, resulting from the necessity
to uniformly sample galaxies throughout the sample volume.
In essence, it involves a trade-off between spatial resolution
and a sufficiently large and representative sample volume.

To test the performance of density field reconstructions
on the basis of a volume-limited galaxy redshift survey, we
take a volume-limited sample from the De Lucia mock sample
based on the Millennium simulation (De Lucia & Blaizot
2007), restricted to a volume in between redshifts z = 0.02
and z = 0.1 and containing galaxies brighter than
$M_r = -20.45$.

Figure 19. Density field Correlation Diagrams. Left: Correlation diagram between full sample DTFE density field reconstruction
(abscissa) and the DTFE density field reconstruction on the basis of the magnitude-limited sample (ordinate). Right: Correlation diagram
between full sample DTFE density field reconstruction (abscissa) and the DTFE density field reconstruction on the basis of the
volume-limited sample (ordinate). All fields are Gaussian filtered on a scale $R_f = 3.0\,h^{-1}\,{\rm Mpc}$.

Figure 20. Radially averaged error profiles. Top: radially
averaged profile of absolute error $\epsilon_1(r)$. Bottom: radially
averaged profile of relative error $\epsilon_2(r)$. Solid lines,
both panels: the error of the DTFE density field reconstruction
on the basis of the magnitude-limited sample. Dashed lines, both
panels: the error of the DTFE density field reconstruction on
the basis of the volume-limited sample (within the limits of the
volume-limited sample volume). Each panel shows the error
profiles for three different filter scales:
$R_f$ = 1.0 (orange), 3.0 (red) and $10.0\,h^{-1}\,{\rm Mpc}$ (black).

Figure 21. Radially averaged error profiles for DTFE, NNFE
and Kriging density field reconstructions on the basis of the
volume-limited galaxy sample. The relative error $\epsilon_2(r)$
is plotted for three filter scales: $R_f$ = 1.0 (orange),
3.0 (red) and $10.0\,h^{-1}\,{\rm Mpc}$ (black). Solid lines: DTFE
error profiles. Dashed lines: NNFE error profiles. Dot-dashed
lines: Natural Lognormal Kriging profiles.

The resulting galaxy sample can be seen in the bottom
panel of Fig. 2 (along with the corresponding NNFE density
field contours). With respect to the magnitude-limited sample
(central panel), let alone the full sample, it clearly lacks
spatial resolution. This may also be appreciated from the
three corresponding density field reconstructions in Fig. 12.
The loss of small scale details is obvious: close inspection and
comparison with the full sample reveals the loss of fine
filamentary features separating small voids. They either merge
into larger surrounding overdensities or get lost in enlarged
void regions.
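The selection cuts quoted above (redshifts between 0.02 and 0.1, galaxies brighter than $M_r = -20.45$) amount to a simple filter on the catalogue. The sketch below uses a hypothetical `Galaxy` record and a made-up five-galaxy catalogue; the real mock catalogue carries many more fields, and any corrections applied to the magnitudes are outside the scope of this illustration.

```python
from dataclasses import dataclass

Z_MIN, Z_MAX = 0.02, 0.1   # redshift limits of the volume-limited sample
M_R_LIMIT = -20.45         # absolute r-band magnitude limit


@dataclass
class Galaxy:
    redshift: float
    abs_mag_r: float


def in_volume_limited_sample(galaxy):
    """True if the galaxy lies in the redshift range and is brighter than the limit.

    Brighter means a numerically smaller (more negative) absolute magnitude.
    """
    return Z_MIN <= galaxy.redshift <= Z_MAX and galaxy.abs_mag_r < M_R_LIMIT


def volume_limited(catalogue):
    """Select the volume-limited subsample of a catalogue."""
    return [g for g in catalogue if in_volume_limited_sample(g)]


# Made-up catalogue entries for illustration only.
catalogue = [
    Galaxy(0.05, -21.0),   # inside the volume and bright enough: kept
    Galaxy(0.05, -19.5),   # inside the volume but too faint: dropped
    Galaxy(0.15, -22.0),   # bright but beyond the redshift limit: dropped
    Galaxy(0.01, -21.5),   # bright but closer than the inner limit: dropped
    Galaxy(0.10, -20.5),   # on the outer redshift limit and bright enough: kept
]

sample = volume_limited(catalogue)
print(len(sample), "of", len(catalogue), "galaxies retained")
```

Every retained galaxy could have been observed anywhere within the volume, which is what makes the sampling uniform, at the price of discarding the fainter galaxies that the magnitude-limited sample still contains at small distances.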



6.4.1 Density Field Correlation Diagrams

Figure 19 compares the correlation diagrams for the DTFE
density field reconstructions on the basis of magnitude-limited
(left) and volume-limited samples (right). Overall, they appear
rather similar, although there are some important details
in which they differ. One of these concerns the more
extended maximum of the correlation diagram in the case of
the magnitude-limited sample density field. In other words,
the errors of the magnitude-limited sample density field are
usually smaller than those for the volume-limited sample
density field.



6.4.2 Radially averaged error profiles

Figure [20] compares the profiles of the radially averaged
absolute errors e1(r) (equation 23) and of the radially
averaged relative errors e2(r) (equation 24) for the DTFE
density field reconstructions on the basis of the
magnitude-limited (solid lines) and on the basis of the
volume-limited surveys (dashed lines). The error profiles are
assessed for three Gaussian filter radii, Rf = 1.0, 3.0 and
10.0 h^-1 Mpc.

The first observation from Fig. [20] is that the errors
of the volume-limited sample are uniformly distributed
throughout the survey volume, as might be expected for a
statistically uniform sample. We also find that the errors of
the magnitude-limited sample density field reconstructions
are systematically lower than those for the volume-limited
sample. This remains so up to the edge of the volume-limited
sample, at R ≈ 300 h^-1 Mpc, where the sampling densities of
the volume-limited and magnitude-limited samples are equal.



6.4.3 DTFE, NNFE and Kriging error profiles 

When comparing the error profiles for the three different 
density field reconstructions we hardly find any significant 
differences. This is confirmed by Fig. [21], which presents the
radially averaged relative error profiles e2(r) for the DTFE,
NNFE and Kriging reconstructions at three different filter 
radii. 

The uniformity of the errors appears to suggest that 
a major source for the observed errors has to be found in 
the density estimate itself, rather than in the interpolation 
technique. If there are any differences, we may argue that 
these concern a slightly better performance of DTFE. 



7 TOPOLOGICAL ANALYSIS 

To probe the global pattern of the mass distribution we 
turn to the topological structure of the SDSS survey. This 
is largely dependent on the higher order correlations in the 
density field. The error analysis described in the previous
subsections would not necessarily be able to detect key
differences in the large scale topology. In this section we
seek to evaluate the quality of the topological renderings of
the density field, where we focus on the structure defined by
the voids in the cosmic density field.

There are various approaches to studying the topological
structure of the large scale mass distribution. One option
for characterizing the global topology of the cosmic matter
distribution is in terms of four Minkowski functionals
(Mecke et al. 1994; Schmalzing et al. 1999). These are
solidly based on the theory of spatial statistics and also
have the great advantage of being known analytically in
the case of Gaussian random fields. In particular, the genus
of the density field has received substantial attention as a
strongly discriminating factor between intrinsically different
spatial patterns (Gott et al. 1986, 1989; Park et al. 1992;
Hoyle et al. 2002; Gott et al. 2008; Zhang et al. 2009). An
attempt to extend the scope of the Minkowski functionals
towards locally defined topological measures of the density
field has been developed in the SURFGEN project defined
by Sahni and Shandarin and their coworkers (Sahni et al.
1998; Shandarin et al. 2004). The main problem for these
formalisms remains the user-defined, and thus potentially
biased, nature of the continuous density field inferred from
the sample of discrete objects.

Here we specifically address the topology of the SDSS 
galaxy distribution on the basis of the void population. 
To this end, we segment the galaxy distribution into 
void patches by means of the Watershed Void Finder 
(Platen et al. 2007). We test the watershed segmentation of
the DTFE density field obtained from the magnitude-limited
mock sample by comparing it to the watershed segmentation 
of the full galaxy sample density field. The topological errors 
are quantified according to the watershed segmentation of 
both fields. 



7.1 Watershed Void Finder (WVF) 

The Watershed Void Finder (WVF) is an implementation of
the Watershed Transform for segmentation of images of the
galaxy and matter distribution into distinct regions and
objects and the subsequent identification of voids
(Platen et al. 2007).

The basic idea behind the watershed transform finds 
its origin in geophysics. It delineates the boundaries of the 
separate domains, the basins, into which yields of e.g. rainfall 
will collect. The analogy with the cosmological context is 
straightforward: voids are to be identified with the basins, 
while the filaments and walls of the cosmic web are the ridges 
separating the voids from each other. 
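
The basin analogy can be made concrete with a small toy example. The Python sketch below assigns every pixel of a gridded density field to the local minimum that is reached by repeatedly stepping to the lowest neighbouring pixel, which is one simple way of realising the watershed idea. It is not the WVF implementation: the function name `basin_labels`, the 4-connected neighbourhood and the tie-breaking rule are assumptions made purely for illustration.

```python
def basin_labels(field):
    """Label each pixel by the local minimum reached via steepest descent.

    `field` is a list of rows of density values. Pixels sharing a label
    drain into the same minimum, i.e. they belong to the same basin.
    """
    n_rows, n_cols = len(field), len(field[0])

    def lowest_neighbour(y, x):
        # Return the lowest of the pixel and its 4-connected neighbours.
        best = (y, x)
        for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            yy, xx = y + dy, x + dx
            if 0 <= yy < n_rows and 0 <= xx < n_cols:
                if field[yy][xx] < field[best[0]][best[1]]:
                    best = (yy, xx)
        return best

    labels = [[None] * n_cols for _ in range(n_rows)]
    minima = {}  # position of a minimum -> basin label
    for y in range(n_rows):
        for x in range(n_cols):
            cy, cx = y, x
            while True:
                ty, tx = lowest_neighbour(cy, cx)
                if (ty, tx) == (cy, cx):  # reached a local minimum
                    break
                cy, cx = ty, tx
            labels[y][x] = minima.setdefault((cy, cx), len(minima))
    return labels


# Two "voids" (minima at 1 and 0) separated by a "wall" (value 5).
profile = [[3, 1, 2, 5, 3, 0, 4]]
labels = basin_labels(profile)
```

In this toy profile the ridge pixel itself is attached to one of the two basins; a full watershed transform would instead mark it as a boundary.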

With respect to the other void finders the watershed
algorithm has several advantages. Because it identifies a
void segment on the basis of the crests in a density field
surrounding a density minimum, it is able to trace the void
boundary even though it has a distorted and twisted shape.
Also, because the contours around well chosen minima are
by definition closed, the transform is not sensitive to local
protrusions between two adjacent voids. The main advantage
of the WVF is that for an ideally smoothed density
field it is able to find voids in an entirely parameter free
fashion.

The WVF consists of eight steps, which are extensively
outlined in Platen et al. (2007). For the success of the WVF
it is of utmost importance that the density field retains its
morphological character. To this end, the two essential first
steps relate directly to DTFE, which guarantees the correct
representation of the hierarchical nature, the weblike
morphology dominated by filaments and walls, and the presence
of voids (van de Weygaert & Schaap 2008). Because in
and around low-density void regions the raw density field
is characterized by a considerable level of noise, a second
essential step suppresses the noise by an adaptive smoothing
algorithm which in a consecutive sequence of repetitive
steps determines the median of densities within the contiguous
Voronoi cell surrounding a point. The determination of
the median density of the natural neighbours turns out to
define a stable and asymptotically converging smooth density
field fit for a proper watershed segmentation. The subsequent
central step of the WVF formalism consists of the
application of the discrete watershed transform on this
adaptively filtered density field.

Figure 22. Watershed Segmentation. The WVF watershed boundaries (white and black edged) inferred from the Millennium magnitude-limited SDSS mock sample. They are superimposed on the density field of the full SDSS mock galaxy sample (coloured), and the contours for the overdense regions in the magnitude-limited survey density field, filtered on a scale Rf = 2.0 h^-1 Mpc.

Figure 23. WVF Segmentation Comparison. The WVF segmentation boundaries of the full SDSS mock galaxy sample are indicated by red lines, the ones for the magnitude-limited sample by black lines. The colour of the segments indicates the nature of the topological errors. Orange: false mergers; Red: false splits. Below: The zoom-ins show two regions with a serious mismatch between the WVF segmentations of full and magnitude-limited sample.
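
The adaptive median smoothing step described above can be sketched in a few lines of Python. In the sketch each point is repeatedly assigned the median of the density values of itself and its natural neighbours. The neighbour lists are supplied by hand here, whereas in practice they would follow from the Delaunay tessellation; the function name `median_filter`, the inclusion of the point itself in the median and the fixed number of iterations are assumptions for illustration, not details taken from the WVF code.

```python
from statistics import median


def median_filter(density, neighbours, n_iter=5):
    """Iteratively replace each value by the median over a point and its neighbours.

    density    : list of density estimates, one per sample point
    neighbours : neighbours[i] lists the indices of the natural neighbours of i
    n_iter     : number of repeated filtering passes (assumed fixed here)
    """
    values = list(density)
    for _ in range(n_iter):
        values = [
            median([values[i]] + [values[j] for j in neighbours[i]])
            for i in range(len(values))
        ]
    return values


# A chain of five points with one noisy spike in the middle.
chain = [[1], [0, 2], [1, 3], [2, 4], [3]]
smoothed = median_filter([1, 1, 9, 1, 1], chain, n_iter=3)
```

Because the median ignores isolated outliers, the single spike in this example is removed while the surrounding values are left unchanged.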

A related tessellation-based method for void identification,
ZOBOV (Neyrinck 2008), yields results similar to those of
WVF (see Colberg et al. 2008). It demonstrates the successful
application of tessellation-based techniques to identify
structures within the cosmic matter distribution. In addition
to the WVF and ZOBOV there is an array of void identification
procedures (see e.g. Kauffmann & Fairall 1991; El-Ad & Piran
1997; Hoyle et al. 2002; Plionis & Basilakos 2002;
Arbabi-Bidgoli & Müller 2002; Patiri et al. 2006; Colberg et al.
2005; Shandarin et al. 2006; Colberg et al. 2008). The
"voidfinder" algorithm of El-Ad & Piran (1997) has been at
the basis of most voidfinding methods. However, this
successful approach is not able to analyze complex spatial
configurations in which voids may have arbitrary shapes and
contain a range and variety of substructures, which lies at
the heart of our analysis.



7.2 WVF void population maps 

Figure [22] shows the watershed segmentation generated by 
the mock catalogue. It is marked by the watershed bound- 
aries superimposed on the contour maps of the density field 
of the full galaxy sample. These boundaries are visible as the 
white or thick black cell edges (dependent on the width of 
the boundary). The softer contour levels indicate the
overdense regions in the Rf = 2.0 h^-1 Mpc filtered full sample
density field.

For an impression of the ability to infer the correct 
watershed segmentation - and void population - from the 
magnitude-limited survey, we compare it with the segmen- 
tation for the full galaxy sample. In Fig. [23] the watershed 
boundaries of the latter are marked by red edges, while the 
ones for the magnitude-limited sample are marked by the 
black lines. 

At distances up to R ~ 200 h^-1 Mpc there is overall a
reasonable agreement between the two void segmentations.
Beyond that distance, we find that the larger segments of the
magnitude-limited sample - a consequence of the decreasing
structural resolution of the survey - encompass several
smaller void segments from the full sample. Beyond that
distance we also find some regions with strong differences
between the two segmentations. The segmentation within
the large voids in the two zoom-ins illustrates this. In the
magnitude-limited survey segmentation, the weaker bridges
between the smaller voids visible in the full sample have
vanished.

It is a clear illustration of the fact that at large distances 
the only topological information retained in the magnitude- 
limited density field is the skeleton defined by the strongest 
and most overdense filaments and walls, locations traced by 
the brightest galaxies. The more tenuous filigree of smaller 
filaments within the low density regions is lost. 




Figure 24. Definition of the false split measure. The four gray
circles represent the patches from the reconstructed segmentation,
the black circle refers to a patch in the original segmentation.
The gray shaded areas A_j belong to the areas of the three
reconstructed patches that have an intersection with the black
circle. The dashed areas represent intersections between the
original and the reconstructed segmentation, A_{j \cap O}.



7.3 Topological Error Definition: False-splits and False-mergers.

We evaluate the performance of the magnitude-limited survey
watershed segmentation by comparing its void patches
with those of the full galaxy sample.

As described extensively in Platen et al. (2007), the errors
can be classified, to first order, by false splits and false
mergers. A false split is the situation in which a void segment
from the reference field splits into two or more void patches.
The reverse situation is that of a false merger, where two
void segments in the reference field merge into one void
patch in the segmentation of the magnitude-limited survey.

False Split measure

We define a measure for the false split errors. Starting from
the void segmentation in the full sample reconstruction, we
find for each void segment the overlapping void patches in
the magnitude-limited segmentation. The relative fraction of
overlap of these survey void patches with the original void
segment is used as a measure of significance of the overlap.
If there are two or more significant overlaps, we classify the
configuration as a false split.

Figure [24] illustrates the above. The large black circle
represents a void patch in the original segmentation, while
the four gray circles are void segments in the survey
segmentation. The gray shaded areas belong to the areas of the
three reconstructed patches j that intersect the black circle,
while the resulting dashed areas represent the intersection
of the original with each of these void patches. If the
surface area of circle j is equal to A_j, and that of the overlap
with the black original segment A_{j \cap O}, then the false
split measure

    f_j^{FS} = \frac{A_{j \cap O}}{A_j}    (25)

decides on whether this is a significant overlap or not. We
deem this to be so if f_j^{FS} > 0.6. In Fig. [24], void 1 and
void 2 would correspond to significant overlaps. Void 3 would be
excluded as such and would only correspond to a slight shift
of the boundary. Because of the two significantly overlapping
voids, this configuration would be classified as a false split.

False Merger measure 

An almost equivalent measure can be defined for a false
merger. Since, by symmetry, a false merger can be considered
a false split in the full (original) sample segmentation, we
may simply reverse the definition of the false split measure.
If i is a void patch in the original field, with area/volume
A_i, and its relative overlap with a large void patch in the
magnitude-limited survey is

    f_i = \frac{A_{i \cap O}}{A_i}    (26)

we deem it a significant overlap when f_i > 0.6. When
there are at least two original void segments that overlap
significantly with one in the magnitude-limited survey, we
may consider it a false merger.

Correctly identified voids 

Having defined the false split and false merger errors, we
may specify the meaning of a correctly identified patch. A
correct void patch in the magnitude-limited survey is a void
segment which overlaps by at least 60% with a void segment
in the original field, as well as the other way around. These
two conditions prevent a void from being either a false split
or a false merger.
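
The three definitions above can be combined into a short classification routine. The Python sketch below assumes that both segmentations are available as flat lists of void labels, one entry per grid cell, so that areas and overlaps reduce to cell counts. The function name `classify_voids`, this data layout and the example labels are hypothetical; the 0.6 threshold and the use of a strict inequality for significance and of "at least 60%" for correct patches follow the text.

```python
from collections import Counter


def classify_voids(original, survey, threshold=0.6):
    """Classify void patches from two per-cell label lists of equal length."""
    size_o = Counter(original)                # A_i: cells per original void
    size_s = Counter(survey)                  # A_j: cells per survey patch
    overlap = Counter(zip(original, survey))  # cells in each intersection

    pieces = {}   # original void -> survey patches lying significantly inside it
    members = {}  # survey patch -> original voids lying significantly inside it
    correct = []
    for (o, s), n in overlap.items():
        f_split = n / size_s[s]   # false split measure, equation (25)
        f_merge = n / size_o[o]   # false merger measure, equation (26)
        if f_split > threshold:
            pieces.setdefault(o, []).append(s)
        if f_merge > threshold:
            members.setdefault(s, []).append(o)
        if f_split >= threshold and f_merge >= threshold:
            correct.append((o, s))

    return {
        "false_splits": sorted(o for o, p in pieces.items() if len(p) >= 2),
        "false_mergers": sorted(s for s, m in members.items() if len(m) >= 2),
        "correct": sorted(correct),
    }


# Toy example on 20 cells: void A is split into p and q, voids B and C
# are merged into r, and void D is recovered as s.
original = ["A"] * 8 + ["B"] * 4 + ["C"] * 4 + ["D"] * 4
survey = ["p"] * 4 + ["q"] * 4 + ["r"] * 8 + ["s"] * 4
result = classify_voids(original, survey)
```

Configurations that satisfy none of the three criteria are simply not reported, which mirrors the fact that topological errors can be more complex than a single split or merger.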

7.4 Spatial Distribution Topological Errors 

Figure [23] shows the spatial distribution of the topological
errors. The image contains the watershed segmentation for the
full galaxy sample, indicated by the red solid lines. These
are superimposed on the watershed segmentation for the
magnitude-limited survey, indicated by the black solid lines.

The false mergers are indicated by orange patches. The 
dark red patches represent the false splits, while the cor- 
rectly reproduced patches are represented as white cells. It 
is directly apparent that false mergers are far more abundant 
than false splits. 

A visual comparison between the two segmentations
(also cf. Fig. [22]) reveals the disappearance of void
boundaries seen in the original full sample matter distribution.
They disappear within the large void regions found in the 
magnitude limited survey. In other words, these minima are 
absorbed by one large encompassing void. It is a result of 
the tenuous and usually underdense nature of the walls in 
these regions, so that only a few galaxies have to fall out of 
the survey to evoke a merging of voids. 

Nonetheless, the coherence of the large persistent wa- 
tershed lines remains strong throughout the volume. The 
defining skeleton of the cosmic mass distribution remains 
largely intact in a magnitude-limited survey. 

7.5 Topological Error Characteristics 

Figure 25. Correct and erroneous void identifications. Solid lines:
(radially averaged) percentage of correct void patch identifications
as a function of distance R. Dashed lines: (radially averaged)
percentage of erroneous void patch identifications. Blue: DTFE;
Orange: NNFE; Gray: Kriging. Top: density field reconstructions
smoothed at Rf = 3.0 h^-1 Mpc. Bottom: density field
reconstructions at Rf = 10.0 h^-1 Mpc.

The quality of the void representation of the survey can
be inferred from the percentage of correctly identified void
patches, as well as from the percentage of topological errors,
i.e. the total number of false splits and false mergers. Here
we assess the correct identifications and topological errors
as a function of distance R. We accomplish this by counting
the number of correctly identified void patches in radial
shells, along with the number of error patches in the same
shells.
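
The counting in radial shells can be sketched as follows. The Python example assumes that each void patch has been reduced to a single distance R (for instance that of its centre) and a status string; the function name `identification_profile`, the status labels and the shell layout are hypothetical choices for illustration. Patches whose status is neither "correct" nor "error" still count towards the shell total.

```python
def identification_profile(distances, statuses, shell_width, n_shells):
    """Percentage of correct and erroneous void patches per radial shell.

    distances : distance R of each void patch (same units as shell_width)
    statuses  : "correct", "error", or any other string for unclassified patches
    Returns a list of (percent_correct, percent_error) per shell, or
    (None, None) for shells that contain no patches.
    """
    counts = [{"total": 0, "correct": 0, "error": 0} for _ in range(n_shells)]
    for r, status in zip(distances, statuses):
        k = int(r // shell_width)          # index of the shell containing r
        if 0 <= k < n_shells:
            counts[k]["total"] += 1
            if status in ("correct", "error"):
                counts[k][status] += 1

    profile = []
    for c in counts:
        if c["total"] == 0:
            profile.append((None, None))
        else:
            profile.append((100.0 * c["correct"] / c["total"],
                            100.0 * c["error"] / c["total"]))
    return profile


# Made-up patch list: two shells of width 50.
profile = identification_profile(
    [10, 20, 60, 70, 80, 90],
    ["correct", "correct", "correct", "error", "error", "other"],
    shell_width=50,
    n_shells=2,
)
```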

Note that the percentages of correct and of incorrect
void identifications do not always add up to 100 percent,
since topological errors may be far more complex than just
a false split and/or false merger. This happens with multiple
additions or disappearances of void walls.

Figure 26. Correct and erroneous void identifications by DTFE.
Solid lines: (radially averaged) percentage of correct void patch
identifications as a function of distance R. Dashed lines:
(radially averaged) percentage of erroneous void patch
identifications. Various line textures indicate a range of filter
scales: Rf = 1.0 h^-1 Mpc (dark blue), 3.0 h^-1 Mpc, 6 h^-1 Mpc
and 10 h^-1 Mpc (lightest blue).



Fig. [25] plots the identification percentages as function
of radial distance R, for all three reconstruction techniques
(DTFE: blue; NNFE: orange; Kriging: gray). The top panel
concerns the 3 h^-1 Mpc filtered field, the bottom panel the
10 h^-1 Mpc filtered field. In the case of the 3 h^-1 Mpc field
we see a gradual and continuous decrease of correct void
identifications as we move outward, while there is a
corresponding increase of erroneous identifications. The
10 h^-1 Mpc filtered field shows an increase of correct
identifications at a very close range. This is a consequence of
the rapidly rising ability to outline the corresponding larger
underdensities as we expand towards a larger volume starting
from the small nearby cosmic volume.

Filter Scale & Void finding 

The DTFE reconstruction has been studied in somewhat
more detail in Fig. [26]. It plots the percentage of correctly
and incorrectly identified void segments at a range of filter
scales, running from Rf = 1.0 h^-1 Mpc, Rf = 3.0 h^-1 Mpc,
Rf = 6.0 h^-1 Mpc to Rf = 10.0 h^-1 Mpc.

For the larger filter scales, we find that the topology
is reliably reconstructed within a range of R ~ 100-300
h^-1 Mpc, while for the smaller filter scales this seems to
be the case at a much closer range, out to R ≈ 200 h^-1 Mpc.
It is also interesting to see that on small scales, over the
whole radial range, the number of correct identifications
remains rather low and hardly exceeds 60%. On filter scales of
6 h^-1 Mpc and 10 h^-1 Mpc the performance is certainly better
and consistently acceptable over a 200 h^-1 Mpc range. Only
beyond R ~ 300-400 h^-1 Mpc does the number of correct
identifications drop below the 50% mark.



7.6 Topology Range 

In general, we find that the number of correctly identified
voids decreases at larger distances, corresponding with a
related increase of topological errors with distance. Dependent
on the filter scale, we may define a distance Rmax
out to which DTFE/WVF manages to identify a reasonable
number of voids from a magnitude-limited survey. From
Fig. [26] we find:



• Rf = 1 h^-1 Mpc:   Rmax ≈ 100 h^-1 Mpc

• Rf = 3 h^-1 Mpc:   Rmax ≈ 200 h^-1 Mpc

• Rf = 6 h^-1 Mpc:   Rmax ≈ 300 h^-1 Mpc

• Rf = 10 h^-1 Mpc:  Rmax ≈ 400 h^-1 Mpc



Within this range, the fraction of erroneous void
identifications remains well below unity: Rmax is roughly
estimated from Fig. [26] as the scale R at which the solid
curves (correct identification) and dashed curves (erroneous
identification) cross.
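
The crossing criterion can be written down explicitly. The Python sketch below takes sampled curves of correct and erroneous identification percentages and returns the first radius at which the correct curve drops to or below the erroneous one, using linear interpolation between samples. The function name `crossing_radius` and the sample values are made up for illustration and are not read off from the figures.

```python
def crossing_radius(radii, correct, erroneous):
    """First radius at which the 'correct' curve falls to the 'erroneous' one.

    All three arguments are equally long lists, ordered by increasing radius.
    Returns None if the curves do not cross within the sampled range.
    """
    for k in range(1, len(radii)):
        d0 = correct[k - 1] - erroneous[k - 1]
        d1 = correct[k] - erroneous[k]
        if d0 > 0 and d1 <= 0:
            # Linear interpolation of the difference between the two samples.
            t = d0 / (d0 - d1)
            return radii[k - 1] + t * (radii[k] - radii[k - 1])
    return None


# Illustrative percentages only.
r_max = crossing_radius([100, 200, 300], [80, 60, 40], [20, 40, 60])
```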

Also, overall we find a slightly better performance by 
the DTFE reconstruction. Even though the higher order 
NNFE and Kriging techniques produce smoother void re- 
gions, this does not result in a higher number of correctly 
identified voidpatches. 



8 SDSS-DR6 DENSITY RECONSTRUCTION 

Following the extensive analysis discussed in previous
sections, we arrive at the application of the assessed
techniques to the real world galaxy distribution in the 6th
data release of SDSS. The resulting density maps form the
starting point of the extensive statistical study - starting with
a study of the one-point density distribution function and 
galaxy bias - presented in the subsequent papers discussing 
the density field and its cosmography. 

The DTFE, NNFE and Kriging density field reconstructions
based on the (magnitude-limited) galaxy distribution
in the 6th data release of SDSS are shown in Fig. [28]. The
density field has been reconstructed on a 512^3 grid,
representing a spatial resolution of 1.17 h^-1 Mpc.

For reference, in Fig. [27] we show the spatial distribution
of the SDSS DR7 galaxies on which the reconstruction is 
based. They are superimposed as black dots on the DTFE 
density field. 

Besides some minor differences, all three density maps 
show the same prominent features. The higher order NNFE 
and Kriging maps look smoother than the DTFE map. It 
is easy to recognize the almost one-to-one correspondence
between the DTFE and NNFE map, with the DTFE map
having the appearance of a noisier version. The Kriging
map not only contains the same structures and features,
it also has a more coherent appearance in which we can 
more easily recognize the global weblike morphology of the 
density field. Both the DTFE and NNFE maps have a more 
fragmented appearance. 



9 SUMMARY AND DISCUSSION 

This study is the first in a series in which we analyze the
structure and topology of the Cosmic Web as traced by the
Sloan Digital Sky Survey.

Figure 27. Galaxy Distribution SDSS DR6. The magnitude-limited galaxy sample of the SDSS DR6 survey is superimposed on the DTFE density field contours. The density field section is infinitely thin, the galaxies lie within a slice of 10 h^-1 Mpc width. The red coloured contour levels represent the underdense regions, at ρ/ρ_u = [0.001, 0.002, 0.005, 0.01, 0.02, 0.03, 0.05, 0.07, 0.1, 0.2, 0.3, 0.6, 0.7, 0.8, 1]. The heavy black solid line is the mean cosmic density level, δ = 0. Within its contour are the overdense regions, marked by black contour lines at density ρ/ρ_u = [1, 3, 10].



In this study we investigate the ability of three
reconstruction techniques to analyze and investigate weblike
features and geometries in a discrete distribution of objects.
The three methods are the Delaunay Tessellation Field
Estimator (DTFE), Natural Neighbour Field Estimator (NNFE)
and a local "natural" implementation of Kriging, the Natural
Lognormal Kriging technique. DTFE and NNFE are
based on the local geometry defined by the Voronoi and
Delaunay tessellations of the galaxy distribution. The Kriging
formalism is adapted and optimized for the approximate
lognormal density distribution encountered in the mildly
nonlinear cosmic web and is based on the logarithm of the
measured density values. Also, we have chosen to restrict
the evaluations to a localized neighbourhood, the tetrahedral
natural neighbourhood based on the Delaunay triangulation
of the point sample.

The three reconstruction methods are analysed and
compared using mock magnitude-limited and volume-limited
SDSS redshift surveys, obtained on the basis of the
Millennium simulation. We investigate error trends, biases
and the topological structure of the resulting fields. The
reconstructed density fields, mainly from the magnitude-limited
mock survey samples but also from volume-limited
ones, are compared with the density field of the total
simulation galaxy sample. The differences between the various
field reconstructions are investigated on the basis of an
error analysis, mostly involving point-to-point comparisons.
Environmental effects are addressed by evaluating the density
fields on a range of Gaussian filter scales. With respect
to the topology of the survey fields, we concentrate on the
void population identified by the Watershed Void Finder
(Platen et al. 2007). The void populations in the full density
field and in the magnitude-limited survey density fields are
compared. The number of false mergers - in which original
voids emerge as a part of a larger void in the survey field
- and of false splits - in which an original void splits up in
two or more voids in the survey field - forms the basis of the
topological quality evaluation.

By investigating the quality of the resulting density field 
estimates, over a range of scales and in different environ- 
ments, as well as the more global topological structure of 
the weblike network, we wish to identify and understand 
the qualities of these techniques for the different purposes 
and post-processing steps which are the subject of the fol- 
lowing papers in this series. The following observations were 
made: 

• In most tests, DTFE, NNFE and Kriging have largely 
similar density and topology error behaviour. 

• Cosmetically, higher order NNFE and Kriging methods 
produce more visually appealing reconstructions. 

• Quantitatively, DTFE performs (marginally) better. 
Part of this at first sight surprising finding is a consequence 
of the higher sensitivity of the higher order NNFE and 
Kriging interpolation to intrinsic errors in the galaxy sam- 
ple. An additional factor is the smaller natural neighbour- 
hood of DTFE and NNFE with respect to the 3 to 4 times 
larger neighbourhood of Natural Lognormal Kriging, which 
restricts density errors to smaller volumes. 

• With respect to the topological properties of the recon- 
structed density fields, it has become clear that a successful 
recovery of the void population on small scales is rather dif- 
ficult. On these scales, the removal of only a couple of void 
galaxies leads to the spurious merging of observed voids. 

• The void recovery rate improves significantly at filter
scales > 3 h^-1 Mpc. The immediate repercussion is that a
proper analysis of small scale voids, and void galaxies,
necessarily has to be restricted to the local Universe out to at
most 100 h^-1 Mpc. As the environmental influences on the
galaxy formation process seem to be mostly determined on
these scales (Park et al. 2007), our study within this project,
subject of a forthcoming paper in this series, will restrict
itself mainly to this volume.

Figure 28. SDSS DR6 density field reconstructions. Top: DTFE; Centre: NNFE; Bottom: Kriging. Shown are thin - zero width - slice sections through the reconstructed density field. The red coloured contour levels represent the underdense regions, at ρ/ρ_u = [0.001, 0.002, 0.005, 0.01, 0.02, 0.03, 0.05, 0.07, 0.1, 0.2, 0.3, 0.6, 0.7, 0.8, 1]. The heavy black solid line is the mean cosmic density level, δ = 0. Within its contour are the overdense regions, marked by black contour lines at density ρ/ρ_u = [1, 3, 10].



A variety of technical improvements of our DTFE, 
NNFE and Kriging implementations may lead to a better 
performance. One immediate option is to invoke an image 
grid which has a more natural character for the galaxy sur- 
vey context of our study than the cubic grid we have used in 
the evaluations described in this paper. Because of the
flexibility of its definition, the Kriging formalism will form
a promising context for further adaptations and
improvements. Of immediate importance is the implementation of
non-local techniques to optimize the matrix inversion cal- 
culations while retaining the influence of large scale corre- 
lations and predictor-corrector methods to deal with oscil- 
latory instabilities. While we have not yet investigated its 
performance, results of studies in other fields emphasize the 
importance of evaluating the performance of Radial Basis 
Function techniques as a possible alternative. 

The DTFE, NNFE and Kriging density field reconstruc- 
tions form the basis of a series of studies in which we ana- 
lyze the cosmography, void population and spinal structure 
of the local Cosmic Web in the Sloan Digital Sky Survey. 
Given the ease and efficiency of calculation, and its good 
quantitative behaviour, for most of these studies we use the 
DTFE formalism. However, the Natural Lognormal Kriging
results look very promising and appear to produce
a well-behaved coherent weblike density map of the SDSS
survey. With the large advantage of controlling the error
behaviour and properties of the reconstructed map, along 
with the large potential for extensions and optimizations of 
the method, the Natural Lognormal Kriging maps will play 
a dominant role in our study of the Cosmic Web. 

The density field reconstructions within the SDSS DR6, 
and subsequently SDSS DR7, volumes will be subjected to 
a statistical study in the second paper of this series. We 
will focus in particular on the 1-point probability function 
of the SDSS density field. With the help of corresponding
magnitude-limited mock catalogues we will infer to what
extent the lognormal probability function, as well as related
higher order versions, form a proper description of its
statistical character. Also, the mock catalogues will allow us
to assess the bias of the galaxy population in high density
areas and, in particular, the low density void regions.
A cosmographic description of the reconstructed Local Uni- 
verse, along with the identification of filamentary superclus- 
ter complexes, voids and supervoid complexes is the sub- 
ject of the third paper in this series. The study of the void 
population in the SDSS density field, concerning a detailed 
assessment of the void size and shape properties, forms an 
important rationale behind the development of the density 
field reconstructions described in this study. This has
become an interesting area of research, in particular as recent
studies have emphasized the potential of extracting
cosmological information from cosmic voids, in particular that
concerning the dark energy equation of state (Lee & Park
2009; Lavaux & Wandelt 2009; Biswas et al. 2010). In
recent years, there has also been a strong interest in large
scale environmental influences on galaxy properties and on
the galaxy formation process. The tools described in the
present study will allow an assessment on the basis of a
properly defined density field on quasi-linear scales.

APPENDIX A: SDSS COORDINATE SYSTEM

For the analysis of our data sample, we transform the
equatorial coordinates (α, δ, z) of the DR6 NGP and DR7 galaxy
sample to a grid based coordinate system (X, Y, Z). The
observer is located at (X, Y, Z) = (300, 0, 300) h^-1 Mpc,
while the centre of the northern strip is rotated to lie parallel
to the Y-axis, starting at (X, Z) = (300, 300) h^-1 Mpc. The
corresponding transformation for an object with a comoving
distance R(z) = cz/100 h^-1 Mpc is defined by:



    X = R(z) \cos(\delta) \cos(\alpha - 90^\circ)
    Y = R(z) \cos(\delta) \sin(\alpha - 90^\circ)
    Z = R(z) \sin(\delta)                              (A1)
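
As a quick check of equation (A1), the transformation can be coded directly. The Python sketch below implements the three expressions exactly as printed, with angles in degrees and R(z) = cz/100 in h^-1 Mpc. The function name `sdss_to_grid` is a hypothetical choice, and any additional shift that places the observer at (300, 0, 300) h^-1 Mpc is not part of (A1) as printed and is therefore left out.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def sdss_to_grid(alpha_deg, delta_deg, z):
    """Apply equation (A1) to equatorial coordinates and redshift.

    alpha_deg, delta_deg : right ascension and declination in degrees
    z                    : redshift
    Returns (X, Y, Z) in h^-1 Mpc relative to the observer.
    """
    r = C_KM_S * z / 100.0                 # R(z) = cz / 100 in h^-1 Mpc
    a = math.radians(alpha_deg - 90.0)     # rotated right ascension
    d = math.radians(delta_deg)
    x = r * math.cos(d) * math.cos(a)
    y = r * math.cos(d) * math.sin(a)
    zz = r * math.sin(d)
    return x, y, zz


# A galaxy at alpha = 90 deg, delta = 0 deg lies along the X-axis.
x, y, zz = sdss_to_grid(90.0, 0.0, 0.1)
```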



APPENDIX B: DTFE: DELAUNAY TESSELLATION FIELD ESTIMATOR

An extensive outline of the full DTFE procedure can be
found in van de Weygaert & Schaap (2008). For the specific
application to the SDSS density field reconstruction, we
take the following steps of the DTFE procedure:

• Point sample 

The (mock) galaxy samples are assumed to represent an unbiased
sample of the underlying density field. Each sample is therefore
considered to be a general Poisson process of the underlying
density field.

• Boundary Conditions 

We assume vacuum boundary conditions: outside the galaxy 
sample volume we take the minimal assumption of having 
no points. 

• Delaunay Tessellation 

Construct the Delaunay tessellation from the point sample
using the Computational Geometry Algorithms Library
(CGAL).






• Field values point sample 

The density values at the sampled points are determined
from the corresponding Voronoi tessellations. The density
estimate at each sample point i is the normalized
inverse of the volume of its contiguous Voronoi cell $W_i$.
The contiguous Voronoi cell of a point i is the
union of all Delaunay tetrahedra of which point i forms
one of the four vertices (see Fig. 5 for an illustration). We
recognize two applicable situations:

+ Uniform sampling process: 
the point sample is an unbiased sample of the underlying
density field. A typical example is that of N-body simulation
particles. For D-dimensional space the density estimate is

$$
f(\mathbf{x}_i) = (1 + D)\,\frac{m_i}{V(W_i)} \tag{B1}
$$

with $m_i$ the "mass" of sample point i. This situation
concerns the "full" mock galaxy samples and the
volume-limited galaxy samples.

+ Systematic non-uniform sampling process:
sampling density according to a specified selection process,
quantified by an a priori known selection function
$\psi(\mathbf{x})$, varying as a function of sky position $(\alpha, \delta)$ as well
as depth/redshift. For D-dimensional space the density
estimate is

$$
f(\mathbf{x}_i) = (1 + D)\,\frac{m_i}{\psi(\mathbf{x}_i)\,V(W_i)} \tag{B2}
$$



This situation is relevant for the magnitude- or flux- 
limited SDSS and mock galaxy samples. 
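
The two estimates (B1) and (B2) can be illustrated with a minimal two-dimensional sketch (D = 2). It assumes that the Delaunay triangulation has already been computed (for example with CGAL) and is supplied as a list of vertex-index triples. The function names, the toy point set and the `psi` argument are illustrative assumptions, not part of the actual reconstruction code.

```python
def triangle_area(p, q, r):
    """Area of the triangle with vertices p, q, r (2D tuples)."""
    return 0.5 * abs((q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1]))


def dtfe_density(points, triangles, masses=None, psi=None):
    """DTFE density estimate at each sample point for D = 2.

    points    : list of (x, y) sample positions
    triangles : Delaunay triangles as triples of point indices (assumed given)
    masses    : optional list of point masses m_i (default 1 for every point)
    psi       : optional list of selection-function values psi(x_i);
                None corresponds to uniform sampling, equation (B1)
    """
    dim = 2
    n = len(points)
    masses = masses if masses is not None else [1.0] * n
    psi = psi if psi is not None else [1.0] * n

    # Volume (here: area) of the contiguous Voronoi cell W_i, i.e. the union
    # of all Delaunay triangles that have point i as a vertex.
    cell_volume = [0.0] * n
    for tri in triangles:
        area = triangle_area(*(points[k] for k in tri))
        for k in tri:
            cell_volume[k] += area

    # Equations (B1)/(B2): f(x_i) = (1 + D) m_i / (psi(x_i) V(W_i)).
    return [(1 + dim) * masses[i] / (psi[i] * cell_volume[i]) for i in range(n)]


# Toy example: four corners of the unit square plus its centre.
points = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.5, 0.5)]
triangles = [(0, 1, 4), (1, 2, 4), (2, 3, 4), (3, 0, 4)]
density = dtfe_density(points, triangles)
```

With unit masses the corner points, whose contiguous cells cover half of the square, obtain a density of 6 and the central point a density of 3. The factor (1 + D) ensures that the linearly interpolated field integrates to the total mass: each triangle contributes its area times the mean of its three vertex densities, which adds up to 5, the number of points.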



the processing steps is the determination of field values 
following the interpolation procedure(s) outlined above. 
Straightforward "first line" field operations are "Image 
reconstruction" and, subsequently, "Smoothing/Filtering".

+ Image reconstruction:
For a set of image points, usually grid points, determine 
the image value: formally the average field value within 
the corresponding gridcell. For that purpose in this study 
we use the Monte Carlo approach: approximate the in- 
tegral by taking the average over a number of (interpo- 
lated) field values probed at randomly distributed loca- 
tions within the gridcell around an image point. The final 
estimate is obtained by averaging over the interpolated 
field values within a gridcell. 

For image reconstruction we need to assure ourselves 
that we obtain a sensible density estimate within a voxel 
element. In a spatially irregular sample, the Voronoi cell 
defines the natural Nyquist interval. To avoid aliasing, the 
number of interpolation points therefore needs to over- 
sample the voxels of the image grid. One may also take 
the alternative and exact option of piecewise integrating 
the density of the Delaunay tetrahedra that are (partially) 
overlapping with the image voxel. For DTFE this can be
accomplished exactly and relatively quickly, although the
computational and geometric aspects are far from trivial.

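+ Smoothing and Filtering:
Linear filtering of the field f: convolution of the field f
with a filter function $W_s(\mathbf{r}, \mathbf{y})$, usually user-specified,

$$
f_s(\mathbf{r}) = \int f(\mathbf{r}')\, W_s(\mathbf{r}', \mathbf{y})\, \mathrm{d}\mathbf{r}' \tag{B5}
$$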
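
The two "first line" operations can be sketched as follows. This is a minimal two-dimensional illustration in pure Python under stated assumptions: `field` stands for any callable returning the interpolated field value at a location (a hypothetical stand-in for the DTFE interpolation), the filter is taken to be a Gaussian truncated at four standard deviations, and the kernel is renormalized at the grid edges. None of these choices is prescribed by the procedure itself.

```python
import math
import random


def monte_carlo_image(field, nx, ny, cell=1.0, n_samples=64, seed=0):
    """Average `field(x, y)` over each grid cell using random sample locations."""
    rng = random.Random(seed)
    image = []
    for j in range(ny):
        row = []
        for i in range(nx):
            total = 0.0
            for _ in range(n_samples):
                # Uniformly distributed location inside grid cell (i, j).
                x = (i + rng.random()) * cell
                y = (j + rng.random()) * cell
                total += field(x, y)
            row.append(total / n_samples)
        image.append(row)
    return image


def smooth_1d(values, sigma):
    """Convolve a 1D list with a truncated, edge-renormalized Gaussian.

    `sigma` is the filter width in units of grid cells.
    """
    half_width = max(1, int(math.ceil(4.0 * sigma)))
    out = []
    for i in range(len(values)):
        weighted_sum = 0.0
        weight_sum = 0.0
        for k in range(-half_width, half_width + 1):
            if 0 <= i + k < len(values):
                w = math.exp(-0.5 * (k / sigma) ** 2)
                weighted_sum += w * values[i + k]
                weight_sum += w
        out.append(weighted_sum / weight_sum)
    return out


def gaussian_smooth(image, sigma):
    """Separable Gaussian filter: smooth along rows, then along columns."""
    rows = [smooth_1d(row, sigma) for row in image]
    columns = [smooth_1d(list(col), sigma) for col in zip(*rows)]
    return [list(row) for row in zip(*columns)]
```

Increasing `n_samples` reduces the Monte Carlo noise in each grid cell at the cost of more interpolation calls. The edge renormalization keeps a constant field constant, but it does not strictly conserve the total mass near the boundary of the grid.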



• Field Gradient 

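Calculation of the field gradient estimate $\nabla f|_m$ in each
D-dimensional Delaunay simplex m (D = 3: tetrahedron;
D = 2: triangle) by solving the set of linear equations for
the field values at the positions of the (D + 1) tetrahedron
vertices,

$$
\nabla f|_m \;\Longleftarrow\;
\left\{
\begin{array}{cccc}
f_0 & f_1 & f_2 & f_3 \\
\mathbf{r}_0 & \mathbf{r}_1 & \mathbf{r}_2 & \mathbf{r}_3
\end{array}
\right.
\tag{B3}
$$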
• Post-processing.

The real potential of DTFE fields may be found in sophisti- 
cated applications, tuned towards uncovering characteristics 
of the reconstructed fields. An important aspect of this involves
the analysis of structures in the density field. Examples
are the finding of voids, the identification of cosmic structures
and advanced statistical analysis of the density field.



• Interpolation. 

The final basic step of the DTFE procedure is the field inter- 
polation. The processing and post-processing steps involve 
numerous interpolation calculations, for each of the involved 
locations r. Given a location r, the Delaunay tetrahedron m
in which it is embedded is determined. On the basis of the
field gradient $\nabla f|_m$ the field value is computed by (linear)
interpolation,

$$
f(\mathbf{r}) = f(\mathbf{r}_i) + \nabla f|_m \cdot (\mathbf{r} - \mathbf{r}_i) \tag{B4}
$$
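
The gradient and interpolation steps can be illustrated in two dimensions (D = 2), where each Delaunay simplex is a triangle. In the sketch below the triangulation and the field values at the vertices are assumed to be given, the triangle containing a location is found by a brute-force barycentric test, and all function names are hypothetical. A production code would rely on a dedicated point-location algorithm instead of the linear search used here.

```python
def simplex_gradient(verts, values):
    """Gradient of the linear field through a triangle (D = 2).

    Solves f_j - f_0 = grad . (r_j - r_0) for j = 1, 2 by Cramer's rule.
    """
    (x0, y0), (x1, y1), (x2, y2) = verts
    f0, f1, f2 = values
    ax, ay, bx, by = x1 - x0, y1 - y0, x2 - x0, y2 - y0
    det = ax * by - ay * bx  # twice the signed triangle area
    gx = ((f1 - f0) * by - (f2 - f0) * ay) / det
    gy = (ax * (f2 - f0) - bx * (f1 - f0)) / det
    return gx, gy


def contains(verts, r, eps=1e-12):
    """True if location r lies inside (or on the edge of) the triangle."""
    (x0, y0), (x1, y1), (x2, y2) = verts
    det = (x1 - x0) * (y2 - y0) - (y1 - y0) * (x2 - x0)
    # Barycentric coordinates of r with respect to vertices 1 and 2.
    l1 = ((r[0] - x0) * (y2 - y0) - (r[1] - y0) * (x2 - x0)) / det
    l2 = ((x1 - x0) * (r[1] - y0) - (y1 - y0) * (r[0] - x0)) / det
    return l1 >= -eps and l2 >= -eps and l1 + l2 <= 1.0 + eps


def dtfe_interpolate(points, triangles, field, r):
    """Equation (B4): f(r) = f(r_i) + grad f|_m . (r - r_i)."""
    for tri in triangles:
        verts = [points[k] for k in tri]
        if contains(verts, r):
            gx, gy = simplex_gradient(verts, [field[k] for k in tri])
            xi, yi = verts[0]  # any vertex of the simplex can serve as r_i
            return field[tri[0]] + gx * (r[0] - xi) + gy * (r[1] - yi)
    return None  # r lies outside the triangulated region


# Toy example: four corners of the unit square plus its centre.
points = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.5, 0.5)]
triangles = [(0, 1, 4), (1, 2, 4), (2, 3, 4), (3, 0, 4)]
field = [6.0, 6.0, 6.0, 6.0, 3.0]  # assumed field values at the vertices
```

Because the field is linear within each simplex, the result does not depend on which vertex is used as the reference point $\mathbf{r}_i$, and the interpolated field is continuous across simplex boundaries while its gradient is not.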



• Processing. 

We make a distinction between straightforward processing 
steps concerning the production of images and simple 
smoothing and filtering operations on the one hand, and more
complex post-processing on the other hand. Basic to 



